Methods for quantifying microRNA precursors

ABSTRACT

The present invention is directed to methods, reagents, kits and compositions for identifying and quantifying microRNA (miRNA) precursor expression in a biological sample. The method uses gene-specific primers and reverse transcriptase to convert the primary miRNA precursors (pri-miRNAs) and precursor miRNAs (pre-miRNAs) to cDNA. The method also uses amplification reactions with gene-specific forward and reverse primers that are targeted to the hairpin sequence of the pri- and pre-miRNA precursors to detect the expression levels of both the pri- and the pre-miRNAs. In one embodiment, the amplification reaction is a real-time PCR wherein the level of PCR amplification products produced is related to the levels of the microRNA precursors in the biological sample. In another embodiment, a probe is used to distinguish between similar isoforms of microRNA precursors. In another embodiment, the expression level of a pre-miRNA precursor is calculated by using primers and amplification reactions that detect only the pri-miRNA together with primers and amplification reactions that detect both pri- and pre-miRNAs, and calculating the difference.

REFERENCE TO RELATED APPLICATION

This application claims priority from U.S. Provisional Application Ser. No. 60/656,109, filed Feb. 24, 2005, the entire content of which is incorporated herein by reference.

STATEMENT OF GOVERNMENT SUPPORT

This invention is supported, at least in part, by Grant No. CA107435 from the National Institutes of Health, USA. The U.S. government has certain rights in this invention.

BACKGROUND

Mature microRNAs¹ (miRNAs) are endogenous, ˜21 nucleotide (nt), non-coding RNAs whose primary function is believed to be translational repression of protein coding mRNAs. The mature miRNA is processed from longer precursor molecules by the enzymes Drosha and Dicer.

¹The abbreviations used are: cDNA, complementary DNA; C. elegans, Caenorhabditis elegans; CLL, chronic lymphocytic leukemia; LMW, low molecular weight; miRNA, microRNA; mRNA, messenger RNA; nt, nucleotides; PCR, polymerase chain reaction; pri-miRNA, primary microRNA precursor; RT-PCR, reverse transcriptase polymerase chain reaction; Tm, melting temperature.

miRNAs have been found in C. elegans, Drosophila, plants, mice and humans, suggesting an ancient and widespread role for these non-coding RNAs. To date, over 3,500 miRNAs have been discovered, including 114 in C. elegans, 332 in humans and 270 in mice. An algorithm termed miRscan was developed to predict the number of miRNAs in a genome based upon the phylogenetically conserved foldback structure of the miRNA. miRscan predicts the total number of miRNAs in the human genome to be 200-255, or about 1% of the predicted genes in humans.

The founding members of the miRNA class of genes, lin-4 and let-7, are expressed temporally during development of C. elegans. In addition to regulating development in C. elegans, miRNAs have been shown to negatively regulate the proapoptotic gene hid during Drosophila development. Thus, levels of miRNA or miRNA precursors in samples taken from C. elegans or Drosophila can be used to determine the stage of development of these two organisms.

miRNAs are also associated with various diseases. For example, two human miRNAs (miR-15a and miR-16) have been mapped to the region 13q14 that is commonly deleted in chronic lymphocytic leukemia (CLL). The expression of miR-15a and miR-16 is reduced in CLL patients with loss of heterozygosity at 13q14.

Thus, the levels of miRNA and their precursors in samples taken from a test subject, including human subjects, can be used to study the role of miRNAs in health and disease and to identify drugs that modulate miRNA function.

Most of the miRNA expression data published to date have used Northern blotting to detect both the mature and pre-miRNA precursors. Probes designed to hybridize to the mature miRNA detect the ˜22 nt mature miRNA and the ˜75 nt pre-miRNA simultaneously on the blot. Primer extension has also been effectively used to detect the mature miRNA. As tools for monitoring gene expression, gel based assays (Northern blotting, primer extension, RNase protection assays, etc.) have disadvantages, including low throughput and poor sensitivity.

cDNA microarrays are an alternative to Northern blotting to quantify miRNAs since microarrays have excellent throughput. For example, a recent report used cDNA microarrays to monitor the expression of miRNAs during neuronal development. Microarrays have other disadvantages including the necessity for high concentrations of input target for efficient hybridization and signal generation, poor sensitivity for rare targets and the necessity for post-array validation using more sensitive assays such as real-time PCR.

A PCR approach has been used to determine the expression levels of mature miRNAs. This method, while useful to clone miRNAs, is impractical for routine gene expression studies since it involves gel isolation of small RNAs and ligation to linker oligonucleotides. PCR has also been used to measure the expression of primary miRNA precursor molecules.

Because of the short size of miRNAs and the sequence similarity between miRNA family members, new and different methods are needed to detect and quantify their expression. Additionally, it is desirable to analyze the expression levels of miRNA precursors. For example, miRNA precursor levels can provide an indirect method of analyzing the expression levels of mature miRNAs. Studying the differential expression of different miRNA precursors as compared to the mature miRNA is itself of interest. For example, certain disease processes may interfere with different steps during the processing of miRNA precursors.

Therefore, a need exists for a high throughput method that allows for the simultaneous analysis of miRNA precursor molecules and that provides for the analysis of miRNA expression when only small amounts of starting material are available.

SUMMARY OF THE PRESENT INVENTION

In general, the invention relates to methods and compositions for identifying the expression of both pri-microRNA and pre-microRNA precursors in a sample. The method involves detection of a portion of the hairpin sequence that is shared by both the pri-miRNA and the pre-miRNA. In a first aspect, the method uses an initial step where a gene-specific reverse primer is used to reverse transcribe the targeted portion of the hairpin sequence. In another aspect, the method uses gene-specific forward and reverse primers in an amplification reaction to amplify the targeted portion of the hairpin sequence.

In another aspect, the invention features a method for identifying differential expression of hairpin-containing microRNA precursors in a test sample. The method includes (a) performing an amplification reaction on the test sample to amplify a target nucleotide sequence wherein the target nucleotide sequence includes a portion of the hairpin sequence that is longer than the mature microRNA sequence, (b) detectably labeling the target nucleotide sequence, and (c) detecting a difference between the amount of the detectably labeled target nucleotide sequence present in the test sample relative to a corresponding control.
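As a rough illustration of step (c), the difference between a test sample and its control can be expressed as a fold change computed from real-time PCR threshold cycles. The sketch below uses the common comparative threshold-cycle calculation, which is an assumption made here for illustration and is not stated in the text above; it further assumes that each PCR cycle doubles the product and that a reference RNA (for example U6 RNA) is measured in both samples. The function names and C_T values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target level relative to a reference RNA.

    Assumes roughly 100% PCR efficiency, so one cycle of difference
    corresponds to a two-fold difference in starting template.
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_test: float, ct_ref_test: float,
                ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Ratio of reference-normalized expression, test sample over control."""
    test = relative_expression(ct_target_test, ct_ref_test)
    ctrl = relative_expression(ct_target_ctrl, ct_ref_ctrl)
    return test / ctrl


# Hypothetical threshold cycles (not taken from the specification).
fc = fold_change(ct_target_test=25.0, ct_ref_test=18.0,
                 ct_target_ctrl=27.0, ct_ref_ctrl=18.0)
print(fc)  # 4.0: the precursor is four-fold higher in the test sample
```

A value above one indicates higher precursor expression in the test sample than in the control, and a value below one indicates lower expression.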

In another aspect, the invention features a method for detecting a first microRNA precursor in a sample that contains at least a second microRNA precursor that is an isoform of the first microRNA precursor so that the first and second microRNA precursors have hairpin sequences that contain substantially similar primer portions. The method includes performing an amplification reaction on the sample to produce a first amplification product, containing the hairpin sequence of the first microRNA precursor, and a second amplification product containing the hairpin sequence of the second microRNA precursor. The amplification reaction is performed using a forward primer and a reverse primer targeted to the substantially similar primer portions of the hairpin sequences of the first and the second microRNA precursors. The method also includes detecting only the first amplification product using a sequence-specific detection probe targeted to a sequence that is unique to the hairpin sequence of the first microRNA precursor, wherein the unique sequence lies between the substantially similar primer portions of the hairpin sequences.
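The logic of this aspect can be sketched with a toy string model: shared primers amplify both isoforms, while a probe aimed at the unique internal sequence reports only the first. The sequences below are invented for illustration and are not let-7 or any other real miRNA sequences; exact string matching stands in for hybridization, and the probe is written in the same sense as the template for simplicity.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon_interior(template: str, fwd: str, rev: str) -> Optional[str]:
    """Sequence lying between the two primer sites, or None if no product.

    The forward primer must match the template, and the reverse primer
    must match the opposite strand downstream of the forward primer.
    """
    start = template.find(fwd)
    if start < 0:
        return None
    inner_start = start + len(fwd)
    end = template.find(revcomp(rev), inner_start)
    if end < 0:
        return None
    return template[inner_start:end]


def probe_detects(template: str, fwd: str, rev: str, probe: str) -> bool:
    """True only if a product forms and the probe site lies between the primers."""
    interior = amplicon_interior(template, fwd, rev)
    return interior is not None and probe in interior


# Hypothetical isoforms: identical primer portions, different interior.
FWD = "ACGTTGCA"
REV = revcomp("GGATCCTA")
isoform_1 = "ACGTTGCA" + "TTTGAGCCAA" + "GGATCCTA"
isoform_2 = "ACGTTGCA" + "TTTCTACCAA" + "GGATCCTA"
PROBE = "TGAGCC"  # unique to isoform_1

for name, seq in (("isoform_1", isoform_1), ("isoform_2", isoform_2)):
    amplified = amplicon_interior(seq, FWD, REV) is not None
    print(name, "amplified:", amplified,
          "probe signal:", probe_detects(seq, FWD, REV, PROBE))
```

Both toy isoforms yield a product, but only the first gives a probe signal, which mirrors the behavior described for FIG. 13.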

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1. miRNA processing and primer design. miRNAs such as human miR-18 are transcribed as a (A) large primary precursor (pri-miRNA) that is processed by the nuclear enzyme Drosha to produce the (B) putative 62 nt precursor miRNA (pre-miRNA). Both the pri-miRNA and pre-miRNA contain the hairpin structure. The underlined portion of the pre-miRNA represents the sequence of the (C) 22 nt mature miRNA that is processed from the pre-miRNA by the ribonuclease Dicer. Gray line denotes forward primer; Black line denotes reverse primer; Dashed line denotes sense primer used along with the reverse (black) primer to amplify the pri-miRNA only.

FIG. 2. Amplification of short hairpins by the PCR. HeLa cell genomic DNA was amplified by the PCR using primers for miR-124a-2 (lane 1), miR-93-1 (lane 2), let7-d (lane 3), miR-15a (lane 4), miR-16 (lane 5) and miR-147 (lane 6) and resolved on a 2.2% agarose gel. M, 25 bp DNA ladder.

FIG. 3. Optimal reverse transcription conditions for small RNAs. Total RNA was isolated from HCT-116 cells, a fraction of which was further purified to contain a low molecular weight (LMW) fraction of <160 nt. One μg of total or LMW RNA was converted to cDNA using Thermoscript reverse transcriptase and random hexamers (open bars) or gene-specific primers (striped bars). The resulting cDNA was amplified by real-time PCR using primers for (A) let7d, miR-15a or (B) U6 RNA. Mean±SD, triplicate PCRs from a single cDNA.

FIG. 4. Real-time PCR of miRNA precursors. Gene-specific primers were designed to the hairpin of the miR-21 and let-7d miRNA precursors. The cDNA from human cancer cell lines was amplified by real-time PCR and SYBR® green detection. (A) Real-time PCR plots of HCT-8 cDNA using miR-21 primers (blue plot, C_(T)=32.8) and let-7d primers (red plot, C_(T)=29.7). Also shown are the signals that were generated from the no template control reactions (olive plots) and the no reverse transcription control reactions (purple plots). (B) Dissociation curve generated from the heat dissociation protocol that followed the real-time PCR shown in (A). The presence of one peak on the thermal dissociation plot corresponds to a single amplicon from the PCR. The plot colors in (B) match those described in (A).

FIG. 5. Pri-miRNA and pre-miRNA expression in human cancer cell lines. (A) Total RNA from HeLa cells was converted to cDNA using gene specific primers as described in Materials and Methods. The cDNA was amplified by real-time PCR using primers that anneal to the hairpin present in both the pri-miR-18 and pre-miR-18 (C_(T)=26.6) or to the pri-miRNA only (C_(T)=27.6). (B) Total miR-18 precursor expression (pri-miRNA+pre-miRNA) and individual expression (pri-miRNA or pre-miRNA) in six cancer cell lines. Mean of duplicate real-time PCRs from a single cDNA sample.

FIG. 6. miRNA precursor expression in human cancer cell lines. The expression of the miRNA precursors for miR-93-1 (A), miR-147 (B), miR-24-2 (C) and miR-29 (D) in six human tumor cell lines and Drosophila S2 cells was determined by real-time PCR. Gene expression is presented relative to U6 RNA. Mean±SD of triplicate real-time PCRs from a single cDNA sample. * Undetectable expression.

FIG. 7. miRNA precursor expression in the human colorectal cancer cell line HCT-116. The expression of 23 miRNA precursors was determined in the human colorectal cancer cell line HCT-116 by real-time PCR. Gene expression is presented relative to U6 RNA. Mean±SD of triplicate real-time PCRs from a single cDNA sample. * Undetectable expression.

FIG. 8. Treeview analysis of real-time PCR data. The expression of 23 miRNA precursors and U6 RNA was determined in 6 human cancer cell lines by real-time PCR. The relative expression of each gene (mean of triplicate real-time PCRs from a single cDNA sample) was determined as described in Materials and Methods. A median expression value equal to one was designated black. Red shading indicates increased levels of expression and green shading represents decreased levels of expression relative to the median. Gray color, undetectable expression. Data is presented on a logarithmic scale.

FIG. 9. Validation of real-time PCR data by Northern blotting. (A) The precursor expression for miR-29, -21 and -224 relative to U6 RNA was determined by real-time PCR in HL-60, HeLa and HCT-116 cDNA (mean±SD triplicate RNA isolations/reverse transcriptions). (B) Northern blot of the ˜22 nt mature miRNA and the ˜75 nt pre-miRNA in the same cell lines shown in (A). The blots were stripped and re-probed for U6 RNA. P, pre-miRNA, M, mature miRNA.

FIG. 10. Table representing the intra-assay variation from replicate RNA isolations.

FIG. 11. Table showing the efficiency of amplification for miRNA genes using U6 RNA.

FIG. 12. Primer and TaqMan® probe sequences to let-7 miRNA isoforms. (A) The sequences of the miRNA precursors for the members of the human let-7 family of miRNA isoforms. Line above sequence, sequence of the mature miRNA; Dashed underlined, sequences of the forward PCR primers; Boxed, sequences of the reverse PCR primers; Bold: priming sequences that differ among isoforms. Also shown are sequences of the human let-7 family mature miRNAs. (B) The sequence of the TaqMan® MGB probe is double-underlined. Sequences are in the 5′ to 3′ direction.

FIG. 13. Real-time PCR of miRNA precursor isoforms. The sequences of six miRNA precursor isoforms (let-7a-1, let-7a-2, let-7a-3, let-7f-1, let-7f-2 and let-7d) were cloned into plasmids. Real-time PCR was attempted on seven different reactions (in triplicate) containing each plasmid and primers specific to each isoform. Each reaction contained the TaqMan® MGB probe for let-7d. Only the reaction containing the let-7d plasmid gave a detectable signal (A). Following the real-time PCR, a portion of each reaction was run on an agarose gel to demonstrate that PCR had occurred in each reaction (B). NTC, No template control. M, 100 bp DNA ladder.

FIG. 14. Heatmap of miRNA precursor expression in 32 human cancer cell lines. The names of the 32 cancer cell lines are listed on the top of the figure. The names of the miRNAs that were profiled in the cancer cell lines are listed to the right of the figure. The relative expression of each gene was determined by real-time PCR; data are presented as ΔCT. Unsupervised hierarchical clustering was performed using PCR primers to 201 miRNA precursors. Data were unfiltered prior to clustering. A median expression value equal to one was designated black; red, increased expression; green, reduced expression; gray, undetectable expression. (B) Dendrogram of clustering analysis.

FIG. 15. PCR primers used to amplify the human miRNA precursors. p, primers to miRNA primary precursor sequence. All other primers hybridize to the hairpin present in both the primary precursor and precursor miRNA.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods and compositions for the analysis of microRNA precursor expression and will now be described with reference to more detailed examples. The examples illustrate how a person skilled in the art can make and use the invention, and are described here to provide enablement and best mode of the invention without imposing limitations that are not recited in the claims.

All publications, patent applications, patents, internet web pages and other references mentioned herein are expressly incorporated by reference in their entirety. When the definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definitions provided in the present teachings shall control.

Unless otherwise indicated, all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in the following specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should be construed in light of the number of significant digits and ordinary rounding approaches.

Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements. Every numerical range given throughout this specification will include every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein.

The following introduction is useful for understanding the terms used in this description. Without being bound to the following theory and with reference to FIG. 1, it is believed that miRNAs are encoded by genes that are transcribed into single or clustered miRNA precursors. These miRNA precursors are converted to mature forms of miRNAs through a stepwise processing, as depicted in FIG. 1. It is believed that the processing first generates (A) a large ˜70 nucleotide (nt) primary precursor, referred to herein as a “pri-miRNA,” that is then processed by the nuclear enzyme Drosha to produce (B) a putative ˜62 nt precursor, referred to herein as a “pre-miRNA.” Both the pri-miRNA and pre-miRNA contain a characteristic hairpin structure. The underlined portion of the pre-miRNA sequence in FIG. 1 represents the sequence of (C) the ˜22 nt mature miRNA that is processed from the pre-miRNA by the ribonuclease Dicer. Thus the term “pri-microRNA” refers to molecule A, “pre-miRNA” refers to molecule B, “mature miRNA” refers to molecule C, and “miRNA precursor(s)” refers to both pri- and pre-miRNAs (i.e. molecules A and B), as shown in FIG. 1.

Still referring to FIG. 1, both the pri-miRNA and pre-miRNA molecules have a “hairpin sequence,” which is an oligonucleotide sequence having a first half which is at least partially complementary to a second half thereof, thereby causing the halves to fold onto themselves, forming a “hairpin structure.” The hairpin structure is typically made of a “stem” part, which consists of the complementary or partially complementary sequences, and a “loop” part, which is a region located between the two complementary strands of the stem, as depicted in FIG. 1.
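The definition above can be made concrete with a small check of how well the first half of a sequence pairs with the second half once a central loop is set aside. This is only a sketch under simplifying assumptions: the stem arms are taken to be of equal length, bulges and internal loops are ignored, G-U wobble pairs are counted as pairs, and the sequences are invented examples rather than real miRNA precursors. Actual secondary-structure prediction would use a dedicated folding algorithm.

```python
# Base pairs accepted in the stem: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairing_fraction(seq: str, loop_len: int) -> float:
    """Fraction of stem positions that can pair when the sequence folds back.

    The sequence is read as 5' arm + loop + 3' arm, and the i-th base of
    the 5' arm is compared with the i-th base from the 3' end.
    """
    stem_len = (len(seq) - loop_len) // 2
    if stem_len <= 0:
        raise ValueError("sequence too short for the given loop length")
    five_arm = seq[:stem_len]
    three_arm = seq[len(seq) - stem_len:]
    paired = sum((a, b) in PAIRS for a, b in zip(five_arm, reversed(three_arm)))
    return paired / stem_len


# Hypothetical RNA hairpins with a four-base loop.
perfect = "GGCUAGC" + "UUCG" + "GCUAGCC"   # fully complementary arms
partial = "GGCUAGC" + "UUCG" + "GCUAGCA"   # one mismatch at the stem base

print(f"{stem_pairing_fraction(perfect, 4):.3f}")  # 1.000
print(f"{stem_pairing_fraction(partial, 4):.3f}")  # 0.857
```

A fraction of 1.0 corresponds to fully complementary halves, while lower values correspond to the "at least partially complementary" case.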

Provided herein are methods and kits for detecting the expression levels of both the pri- and the pre-miRNA precursor in a test sample using gene-specific primers targeted to a portion of the hairpin sequence shared by both the pri- and the pre-miRNA precursors. The term “target nucleotide sequence” or “target nucleotide” as used herein, refers to the polynucleotide sequence that is sought to be detected, i.e. the sequence that is targeted by the gene-specific primers of the present invention. The target nucleotide sequence, as used herein, comprises a portion of the hairpin sequence which is shared by both the pri- and the pre-miRNA precursors and may comprise the entire hairpin sequence. Alternatively, the target nucleotide sequence may comprise only a portion of the hairpin sequence which portion is substantially longer than the mature miRNA sequence and is typically about 70 nucleotides long. In either case, the target nucleotide sequence may include a few nucleotides beyond the hairpin sequence, so long as the target nucleotide sequence is shared by the pri- and pre-miRNA precursors. Target nucleotide sequence is intended to include DNA (e.g., cDNA or genomic DNA), RNA, analogs of the DNA or RNA generated using nucleotide analogs, and derivatives, fragments and homologs thereof.

The methods described generally employ a two step approach. First, the target nucleotide sequence of the miRNA precursors is reverse transcribed into cDNA using a gene-specific reverse primer and a thermostable reverse transcriptase. Second, the target nucleotide sequence cDNA is amplified and detected, thereby simultaneously detecting the expression levels of both the pri- and the pre-miRNA molecules in the test sample. Alternatively, the methods may be applied directly on genomic DNA without the need for reverse transcription.
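Because primers to the shared hairpin report the pri-miRNA and pre-miRNA together, the pre-miRNA contribution can be estimated by subtracting a pri-miRNA-only measurement, as stated in the Abstract. The sketch below works through that arithmetic with the threshold cycles given for HeLa cells in the description of FIG. 5 (26.6 for the hairpin primers, 27.6 for the pri-miRNA-only primers). It assumes, for illustration only, that both primer pairs amplify with about 100% efficiency and that reverse transcription is equally efficient for both targets; the function names are hypothetical.

```python
def relative_amount(ct: float) -> float:
    """Starting template in arbitrary units, assuming product doubles each cycle."""
    return 2.0 ** -ct


def partition_precursors(ct_hairpin: float, ct_pri_only: float) -> dict:
    """Split the hairpin signal (pri + pre) into pri and pre fractions.

    ct_hairpin:  threshold cycle with primers to the shared hairpin
    ct_pri_only: threshold cycle with primers that detect only the pri-miRNA
    """
    total = relative_amount(ct_hairpin)
    pri = relative_amount(ct_pri_only)
    # Clamp at zero in case measurement noise makes pri exceed the total.
    pre = max(total - pri, 0.0)
    return {"pri_fraction": pri / total, "pre_fraction": pre / total}


# Threshold cycles from the description of FIG. 5 (HeLa cells, miR-18).
result = partition_precursors(ct_hairpin=26.6, ct_pri_only=27.6)
print(f"pri-miRNA: {result['pri_fraction']:.2f} of total precursor")
print(f"pre-miRNA: {result['pre_fraction']:.2f} of total precursor")
```

Under these assumptions, a one-cycle difference means that about half of the hairpin signal comes from the pri-miRNA and the remaining half from the pre-miRNA.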

In some embodiments, the target nucleotide sequence cDNA acts as a template in an amplification reaction. Amplification products are then detected using detection probes. As used herein, the term “amplifying” or “amplification reaction” refers to any means by which at least a part of a target polynucleotide, target polynucleotide surrogate, or combinations thereof, is reproduced, typically in a template-dependent manner, including without limitation, a broad range of techniques for amplifying nucleic acid sequences, either linearly or exponentially. Exemplary means for performing an amplifying step include PCR, primer extension, ligase chain reaction (LCR), ligase detection reaction (LDR), ligation followed by Qβ replicase amplification, strand displacement amplification (SDA), hyperbranched strand displacement amplification, multiple displacement amplification (MDA), nucleic acid sequence-based amplification (NASBA), two-step multiplexed amplifications, rolling circle amplification (RCA), in vitro transcription using a forward primer containing a promoter sequence for RNA polymerase and the like, including multiplex versions or combinations thereof. Descriptions of such techniques can be found in, among other places, Sambrook et al. Molecular Cloning, 3rd Edition; Ausubel et al.; PCR Primer: A Laboratory Manual, Dieffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002), Msuih et al., J. Clin. Micro. 34:501-07 (1996); The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); Abramson et al., Curr Opin Biotechnol. 1993 February; 4(1):41-7, U.S. Pat. No. 6,027,998; U.S. Pat. No. 6,605,451, Barany et al., PCT Publication No. WO 97/31256; Wenz et al., PCT Publication No. WO 01/92579; Day et al., Genomics, 29(1): 152-162 (1995), Ehrlich et al., Science 252:1643-50 (1991); Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990); Favis et al., Nature Biotechnology 18:561-64 (2000); and Rabenau et al., Infection 28:97-102 (2000); Belgrader, Barany, and Lubin, Development of a Multiplex Ligation Detection Reaction DNA Typing Assay, Sixth International Symposium on Human Identification, 1995 (available on the world wide web at: promega.com/geneticidproc/ussymp6proc/blegrad.html); LCR Kit Instruction Manual, Cat. #200520, Rev. #050002, Stratagene, 2002; Barany, Proc. Natl. Acad. Sci. USA 88:188-93 (1991); Bi and Sambrook, Nucl. Acids Res. 25:2924-2951 (1997); Zirvi et al., Nucl. Acid Res. 27:e40i-viii (1999); Dean et al., Proc Natl Acad Sci USA 99:5261-66 (2002); Barany and Gelfand, Gene 109:1-11 (1991); Walker et al., Nucl. Acid Res. 20:1691-96 (1992); Polstra et al., BMC Inf. Dis. 2:18-(2002); Lage et al., Genome Res. 2003 February; 13(2):294-307, and Landegren et al., Science 241:1077-80 (1988), Demidov, V., Expert Rev Mol. Diagn. 2002 November; 2(6):542-8., Cook et al., J Microbiol Methods. 2003 May; 53(2):165-74, Schweitzer et al., Curr Opin Biotechnol. 2001 February; 12(1):21-7, U.S. Pat. No. 5,830,711, U.S. Pat. No. 6,027,889, U.S. Pat. No. 5,686,243, Published P.C.T. Application WO0056927A3, and Published P.C.T. Application WO9803673A1. In some embodiments, newly-formed nucleic acid duplexes are not initially denatured, but are used in their double-stranded form in one or more subsequent steps. An extension reaction is an amplifying technique that comprises elongating a gene-specific primer that is annealed to a template and extended in the 5′ to 3′ direction using an amplifying means such as a polymerase and/or reverse transcriptase.
According to some embodiments, with appropriate buffers, salts, pH, temperature, and nucleotide triphosphates, including analogs thereof, i.e., under appropriate conditions, a polymerase incorporates nucleotides complementary to the template strand starting at the 3′-end of an annealed linker probe, to generate a complementary strand. In some embodiments, the polymerase used for extension lacks or substantially lacks 5′ exonuclease activity. In some embodiments of the present teachings, unconventional nucleotide bases can be introduced into the amplification reaction products and the products treated by enzymatic (e.g., glycosylases) and/or physical-chemical means in order to render the product incapable of acting as a template for subsequent amplifications. In some embodiments, uracil can be included as a nucleobase in the reaction mixture, thereby allowing for subsequent reactions to decontaminate carryover of previous uracil-containing products by the use of uracil-N-glycosylase (see for example Published P.C.T. Application WO9201814A2). In some embodiments of the present teachings, any of a variety of techniques can be employed prior to amplification in order to facilitate amplification success, as described for example in Radstrom et al., Mol. Biotechnol. 2004 February; 26(2):133-46. In some embodiments, amplification can be achieved in a self-contained integrated approach comprising sample preparation and detection, as described for example in U.S. Pat. Nos. 6,153,425 and 6,649,378. Reversibly modified enzymes, for example but not limited to those described in U.S. Pat. No. 5,773,258, are also within the scope of the disclosed teachings. The present teachings also contemplate various uracil-based decontamination strategies, wherein for example uracil can be incorporated into an amplification reaction, and subsequent carry-over products removed with various glycosylase treatments (see for example U.S. Pat. No. 5,536,649, and U.S. Provisional Application 60/584,682 to Andersen et al.). Those in the art will understand that any protein with the desired enzymatic activity can be used in the disclosed methods and kits. Descriptions of DNA polymerases, including reverse transcriptases, uracil N-glycosylase, and the like, can be found in, among other places, Twyman, Advanced Molecular Biology, BIOS Scientific Publishers, 1999; Enzyme Resource Guide, rev. 092298, Promega, 1998; Sambrook and Russell; Sambrook et al.; Lehninger; PCR: The Basics; and Stoflet E S, Koeberl D D, Sarkar G, Sommer S S, Genomic amplification with transcript sequencing, Science 239(4839):491-4 (1988).

In some embodiments, detector probes are used to detect amplified target nucleotides. As used herein, the term “detector probe” refers to a molecule used in an amplification reaction, typically for quantitative or real-time PCR analysis, as well as endpoint analysis. Such detector probes can be used to monitor the amplification of the target polynucleotide. In some embodiments, detector probes present in an amplification reaction are suitable for monitoring the amount of amplicon(s) produced as a function of time. Such detector probes include, but are not limited to, the 5′-exonuclease assay (TaqMan® probes described herein; see also U.S. Pat. No. 5,538,848), various stem-loop molecular beacons (see e.g., U.S. Pat. Nos. 6,103,476 and 5,925,517 and Tyagi and Kramer, 1996, Nature Biotechnology 14:303-308), stemless or linear beacons (see, e.g., WO 99/21881), PNA Molecular Beacons® (see, e.g., U.S. Pat. Nos. 6,355,421 and 6,593,091), linear PNA beacons (see, e.g., Kubista et al., 2001, SPIE 4264:53-58), non-FRET probes (see, e.g., U.S. Pat. No. 6,150,097), Sunrise®/Amplifluor® probes (U.S. Pat. No. 6,548,250), stem-loop and duplex Scorpion™ probes (Solinas et al., 2001, Nucleic Acids Research 29:E96 and U.S. Pat. No. 6,589,743), bulge loop probes (U.S. Pat. No. 6,590,091), pseudo knot probes (U.S. Pat. No. 6,589,250), cyclicons (U.S. Pat. No. 6,383,752), MGB Eclipse™ probe (Epoch Biosciences), hairpin probes (U.S. Pat. No. 6,596,490), peptide nucleic acid (PNA) light-up probes, self-assembled nanoparticle probes, and ferrocene-modified probes described, for example, in U.S. Pat. No. 6,485,901; Mhlanga et al., 2001, Methods 25:463-471; Whitcombe et al., 1999, Nature Biotechnology. 17:804-807; Isacsson et al., 2000, Molecular Cell Probes. 14:321-328; Svanvik et al., 2000, Anal Biochem. 281:26-35; Wolffs et al., 2001, Biotechniques 766:769-771; Tsourkas et al., 2002, Nucleic Acids Research. 30:4208-4215; Riccelli et al., 2002, Nucleic Acids Research 30:4088-4093; Zhang et al., 2002 Shanghai. 34:329-332; Maxwell et al., 2002, J. Am. Chem. Soc. 124:9606-9612; Broude et al., 2002, Trends Biotechnol. 20:249-56; Huang et al., 2002, Chem. Res. Toxicol. 15:118-126; and Yu et al., 2001, J. Am. Chem. Soc 14:11155-11161. Detector probes can also comprise quenchers, including without limitation black hole quenchers (Biosearch), Iowa Black (IDT), QSY quencher (Molecular Probes), and Dabsyl and Dabcyl sulfonate/carboxylate Quenchers (Epoch). Detector probes can also comprise two probes, wherein for example a fluor is on one probe, and a quencher is on the other probe, wherein hybridization of the two probes together on a target quenches the signal, or wherein hybridization on the target alters the signal signature via a change in fluorescence. Detector probes can also comprise sulfonate derivatives of fluorescein dyes with SO₃ instead of the carboxylate group, phosphoramidite forms of fluorescein, and phosphoramidite forms of CY 5 (commercially available for example from Amersham). In some embodiments, intercalating labels are used such as ethidium bromide, SYBR® Green I (Molecular Probes), and PicoGreen® (Molecular Probes), thereby allowing visualization in real-time, or end point, of an amplification product in the absence of a detector probe. In some embodiments, real-time visualization can employ both an intercalating detector probe and a sequence-based detector probe. In some embodiments, the detector probe is at least partially quenched when not hybridized to a complementary sequence in the amplification reaction, and is at least partially unquenched when hybridized to a complementary sequence in the amplification reaction. In some embodiments, probes further comprise various modifications such as a minor groove binder (see for example U.S. Pat. No. 6,486,308) to further provide desirable thermodynamic characteristics.

In some embodiments, the target nucleotide cDNA can be detected using a variety of hybridization techniques. As used herein, the term “hybridization” refers to the complementary base-pairing interaction of one nucleic acid with another nucleic acid that results in formation of a duplex, triplex, or other higher-ordered structure, and is used herein interchangeably with “annealing.” Typically, the primary interaction is base specific, e.g., A/T and G/C, by Watson/Crick and Hoogsteen-type hydrogen bonding. Base-stacking and hydrophobic interactions can also contribute to duplex stability. Conditions for hybridizing detector probes and primers to complementary and substantially complementary target sequences are well known, e.g., as described in Nucleic Acid Hybridization, A Practical Approach, B. Hames and S. Higgins, eds., IRL Press, Washington, D.C. (1985) and J. Wetmur and N. Davidson, Mol. Biol. 31:349 et seq. (1968). In general, whether such annealing takes place is influenced by, among other things, the length of the polynucleotides and the complementary sequence, the pH, the temperature, the presence of mono- and divalent cations, the proportion of G and C nucleotides in the hybridizing region, the viscosity of the medium, and the presence of denaturants. Such variables influence the time required for hybridization. Thus, the preferred annealing conditions will depend upon the particular application. Such conditions, however, can be routinely determined by the person of ordinary skill in the art without undue experimentation. It will be appreciated that complementarity need not be perfect; there can be a small number of base pair mismatches that will minimally interfere with hybridization between the target sequence and the single stranded nucleic acids of the present teachings. 
However, if the number of base pair mismatches is so great that no hybridization can occur under minimally stringent conditions then the sequence is generally not a complementary target sequence. Thus, complementarity herein is meant that the probes or primers are sufficiently complementary to the target sequence to hybridize under the selected reaction conditions to achieve the ends of the present teachings. Novel hybridization techniques, such as bead-based flow cytometry (described for example in Lu J, et al., MicroRNA expression profiles classify human cancers, Nature. 2005 Jun. 9; 435(7043):834-8) are also contemplated by the present teachings.

In some embodiments, the 3′ gene-specific primer can be used in an extension reaction. As used herein, the term “extension reaction” refers to an elongation reaction in which the 3′ gene-specific primer is extended to form an extension reaction product comprising a strand complementary to the target polynucleotide. In some embodiments, the target polynucleotide is a portion of the hairpin sequence common to both a pri- and a pre-miRNA molecule and the extension reaction is a reverse transcription reaction comprising a reverse transcriptase. In some embodiments, the extension reaction is a reverse transcription reaction comprising a polymerase derived from a Eubacteria. In some embodiments, the extension reaction can comprise rTth polymerase. It will be appreciated that the use of polymerases that also comprise reverse transcription properties can allow for some embodiments of the present teachings to comprise a first reverse transcription reaction followed thereafter by an amplification reaction, thereby allowing for the consolidation of two reactions in essentially a single reaction. In some embodiments, the consolidation of the extension reaction and a subsequent amplification reaction is further contemplated by the present teachings.

As used herein, the term “detection” refers to any of a variety of ways of determining the presence and/or quantity and/or identity of a target polynucleotide. In some embodiments employing a donor moiety and signal moiety, one may use certain energy-transfer fluorescent dyes. Certain nonlimiting exemplary pairs of donors (donor moieties) and acceptors (signal moieties) are illustrated, e.g., in U.S. Pat. Nos. 5,863,727; 5,800,996; and 5,945,526. Use of some combinations of a donor and an acceptor has been called FRET (Fluorescence Resonance Energy Transfer). In some embodiments, fluorophores that can be used as signaling probes include, but are not limited to, rhodamine, cyanine 3 (Cy 3), cyanine 5 (Cy 5), fluorescein, Texas Red (Molecular Probes) and the group Vic™, Liz™, Tamra™, 5-Fam™, 6-Fam™ (all available from Applied Biosystems, Foster City, Calif.). In some embodiments, the amount of detector probe that gives a fluorescent signal in response to an excitation light typically relates to the amount of nucleic acid produced in the amplification reaction. Thus, in some embodiments, the amount of fluorescent signal is related to the amount of product created in the amplification reaction. In such embodiments, one can therefore measure the amount of amplification product by measuring the intensity of the fluorescent signal from the fluorescent indicator. According to some embodiments, one can employ an internal standard to quantify the amplification product indicated by the fluorescent signal. See, e.g., U.S. Pat. No. 5,736,333. Devices have been developed that can perform a thermal cycling reaction with compositions containing a fluorescent indicator, emit a light beam of a specified wavelength, read the intensity of the fluorescent dye, and display the intensity of fluorescence after each cycle. Devices comprising a thermal cycler, light beam emitter, and a fluorescent signal detector, have been described, e.g., in U.S. Pat. Nos. 
5,928,907; 6,015,674; and 6,174,670, and include, but are not limited to, the ABI Prism® 7700 Sequence Detection System, the ABI GeneAmp® Sequence Detection System series (all available from Applied Biosystems, Foster City, Calif.), or the LightCycler® (Roche Diagnostics, Indianapolis, Ind.). In some embodiments, each of these functions can be performed by separate devices. In some embodiments, combined thermal cycling and fluorescence detecting devices can be used for precise quantification of target nucleic acid sequences in samples. In some embodiments, fluorescent signals can be detected and displayed during and/or after one or more thermal cycles, thus permitting monitoring of amplification products as the reactions occur in “real time.” In some embodiments, one can use the amount of amplification product and number of amplification cycles to calculate how much of the target nucleic acid sequence was in the sample prior to amplification. In some embodiments, one could simply monitor the amount of amplification product after a predetermined number of cycles sufficient to indicate the presence of the target nucleic acid sequence in the sample. One skilled in the art can easily determine, for any given sample type, primer sequence, and reaction condition, how many cycles are sufficient to determine the presence of a given target polynucleotide. As used herein, determining the presence of a target can comprise identifying it, as well as optionally quantifying it.

In some embodiments, different detector probes may distinguish between different target polynucleotides. A non-limiting example of such a probe is a 5′-nuclease fluorescent probe, such as a TaqMan® probe molecule, wherein a fluorescent molecule is attached to a fluorescence-quenching molecule through an oligonucleotide link element. In some embodiments, the oligonucleotide link element of the 5′-nuclease fluorescent probe binds to a specific sequence of an identifying portion or its complement. In some embodiments, different 5′-nuclease fluorescent probes, each fluorescing at different wavelengths, can distinguish between different amplification products within the same amplification reaction. For example, in some embodiments, one could use two different 5′-nuclease fluorescent probes that fluoresce at two different wavelengths (WL_(A) and WL_(B)) and that are specific to two different hairpin sequences of two different extension reaction products (A′ and B′, respectively). Amplification product A′ is formed if target nucleic acid sequence A is in the sample, and amplification product B′ is formed if target nucleic acid sequence B is in the sample. In some embodiments, amplification product A′ and/or B′ may form even if the appropriate target nucleic acid sequence is not in the sample, but such occurs to a measurably lesser extent than when the appropriate target nucleic acid sequence is in the sample. After amplification, one can determine which specific target nucleic acid sequences are present in the sample based on the wavelength of signal detected and their intensity. Thus, if an appropriate detectable signal value of only wavelength WL_(A) is detected, one would know that the sample includes target nucleic acid sequence A, but not target nucleic acid sequence B. 
If an appropriate detectable signal value of both wavelengths WL_(A) and WL_(B) is detected, one would know that the sample includes both target nucleic acid sequence A and target nucleic acid sequence B. In some embodiments, detection can occur through any of a variety of mobility dependent analytical techniques based on differential rates of migration between different analyte species. Exemplary mobility-dependent analysis techniques include electrophoresis, chromatography, mass spectroscopy, sedimentation, e.g., gradient centrifugation, field-flow fractionation, multi-stage extraction techniques, and the like. In some embodiments, mobility probes can be hybridized to amplification products, and the identity of the target polynucleotide determined via a mobility dependent analysis technique of the eluted mobility probes, as described for example in Published P.C.T. Application WO04/46344 to Rosenblum et al., and WO01/92579 to Wenz et al. In some embodiments, detection can be achieved by various microarrays and related software such as the Applied Biosystems Array System with the Applied Biosystems 1700 Chemiluminescent Microarray Analyzer and other commercially available array systems available from Affymetrix, Agilent, Illumina, and Amersham Biosciences, among others (see also Gerry et al., J. Mol. Biol. 292:251-62, 1999; De Bellis et al., Minerva Biotec 14:247-52, 2002; and Stears et al., Nat. Med. 9:14045, including supplements, 2003). It will also be appreciated that detection can comprise reporter groups that are incorporated into the reaction products, either as part of labeled primers or due to the incorporation of labeled dNTPs during an amplification, or attached to reaction products, for example but not limited to, via hybridization tag complements comprising reporter groups or via linker arms that are integral or attached to reaction products. 
Detection of unlabeled reaction products, for example using mass spectrometry, is also within the scope of the current teachings.

The term “corresponding” as used herein refers to a specific relationship between the elements to which the term refers. Some non-limiting examples of corresponding include: a gene-specific forward or reverse primer can correspond with a target polynucleotide, and vice versa. A detector probe can correspond with a particular region of a target polynucleotide and vice versa. In some cases, the corresponding elements can be complementary. In some cases, the corresponding elements are not complementary to each other, but one element can be complementary to the complement of another element. The term corresponding is also used when referring to the pri-miRNA and the pre-miRNA molecules that belong to one miR gene.

The term “sample” as used herein refers to any sample that contains the target nucleotide sequence and can be obtained from any organism known to contain miRNA encoding genes. In certain examples, the sample is obtained from a mammal, such as a human or mouse. In other embodiments, the sample is derived from other organisms, such as a plant, C. elegans or drosophila. It will be appreciated that the target nucleotide sequence can be isolated from samples using any of a variety of procedures known in the art.

The term “pair of primers targeted to” a target nucleotide sequence refers to forward and reverse primers that can anneal to either end of the target nucleotide sequence. It is appreciated by those skilled in the art that a forward (or sense) primer can usually directly hybridize to a first primer portion located at the 5′ end of the target nucleotide sequence, while a reverse (or anti-sense) primer can hybridize to the complement of the second primer portion located at the 3′ end of the target nucleotide sequence.
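By way of non-limiting illustration, the relationship between a primer pair and its target can be sketched in Python. The sketch assumes DNA-alphabet sequences written 5′→3′, requires exact matches, and uses a made-up toy target; the function names `reverse_complement` and `locate_amplicon` are hypothetical and are not part of the present teachings.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def locate_amplicon(target: str, forward: str, reverse: str):
    """Locate the amplicon defined by a primer pair on the sense strand.

    The forward primer is searched for directly; the reverse primer is
    searched for as its reverse complement. Returns (start, end, length)
    with 0-based, end-exclusive coordinates, or None if either primer has
    no exact match or the reverse site lies upstream of the forward site.
    """
    target = target.upper()
    start = target.find(forward.upper())
    rev_site = target.find(reverse_complement(reverse))
    if start == -1 or rev_site == -1 or rev_site < start:
        return None
    end = rev_site + len(reverse)
    return start, end, end - start


if __name__ == "__main__":
    # Toy target sequence, invented purely for illustration.
    toy_target = "ACGTACGTTAGCCGATAGGCTTAACCGGTATCGATCGGA"
    print(locate_amplicon(toy_target, "ACGTACGT", "TCCGATCG"))
```

A check of this kind could be used, for example, to confirm that a candidate pair spans a substantial portion of a hairpin sequence before the primers are ordered.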

The term “upstream” as used herein takes on its customary meaning in molecular biology, and refers to the location of a region of a polynucleotide that is on the 5′ side of a “downstream” region. Correspondingly, the term “downstream” refers to the location of a region of a polynucleotide that is on the 3′ side of an “upstream” region.

In the presented examples, detection of hairpin-containing miRNA precursor levels is achieved by (a) converting the pri- and pre-miRNA precursors to cDNA using a gene-specific reverse primer and a reverse transcriptase, and (b) amplifying and detecting a portion of the hairpin sequence common to the pri- and pre-miRNA precursors. In each amplification reaction, the forward and reverse primers are targeted to amplify a substantial portion of the hairpin sequence. In one example, the forward primer is targeted to a sequence located at the 5′ end of the hairpin structure and the reverse primer is targeted to a sequence located at the 3′ end of the hairpin structure.

Appropriate primers can be designed using the following criteria. Both forward and reverse primers are designed to be located within or substantially within the hairpin sequence of the miRNA precursors (FIG. 1). The pre-miRNA sequences are predicted based upon the fold-back structure. Sequences of known miRNA precursor species are available on the miRNA registry (http://www.sanger.ac.uk/Software/Rfam/mirna/index.shtml) (Griffiths-Jones, S., The microRNA Registry. Nucleic Acids Res, 2004. 32(1): p. D109-11). An extension of about 4 nucleotides is allowed for each primer over the presumed 5′ or 3′ termini of the pre-miRNA. It is understood, however, that different length primers can be used as long as the target nucleotide amplified by the primers is shared between the pri- and the pre-miRNA precursors. Since the hairpin is contained within both the pri-miRNA and the pre-miRNA, primers designed to the hairpin simultaneously amplify both RNAs. Primers are designed with a maximal T_(m) difference between both primers of ≦2° C. and an optimal primer length between 16-24 nucleotides for primers having the chemical composition of natural DNA. The primers have a T_(m) range of 48-62° C., preferably 49-59° C., and more preferably 55-59° C. Suitable primers for quantifying levels of certain precursor miRNAs in test samples obtained from human subjects include, but are not limited to, the primers shown in Table 1.
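As a non-limiting illustration, the numerical design criteria stated above (primer length of 16-24 nucleotides, T_(m) of 48-62° C., and a T_(m) difference of no more than 2° C.) can be checked with a short Python sketch. The function name `check_primer_pair` is hypothetical, and the T_(m) values are supplied by the user because the calculation used to obtain them is not specified here.

```python
def check_primer_pair(forward, reverse, tm_forward, tm_reverse,
                      min_len=16, max_len=24,
                      tm_low=48.0, tm_high=62.0, max_tm_diff=2.0):
    """Return a list of design criteria that the primer pair fails.

    An empty list means the pair satisfies every criterion checked here.
    Tm values (degrees C) are inputs; this sketch does not compute them.
    """
    problems = []
    for name, seq, tm in (("forward", forward, tm_forward),
                          ("reverse", reverse, tm_reverse)):
        if not min_len <= len(seq) <= max_len:
            problems.append(
                f"{name} primer length {len(seq)} nt is outside "
                f"{min_len}-{max_len} nt")
        if not tm_low <= tm <= tm_high:
            problems.append(
                f"{name} primer Tm {tm} C is outside {tm_low}-{tm_high} C")
    if abs(tm_forward - tm_reverse) > max_tm_diff:
        problems.append(
            f"Tm difference {abs(tm_forward - tm_reverse)} C exceeds "
            f"{max_tm_diff} C")
    return problems


if __name__ == "__main__":
    # miR-16 primer pair and Tm values as listed in Table 1.
    print(check_primer_pair("GCAGCACGTAAATATTGGCGT",
                            "CAGCAGCACAGTTAAATACTGGAG", 59, 57))
```

The thresholds are exposed as keyword arguments so that the narrower preferred ranges (for example 55-59° C.) can be substituted without changing the function.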

In one example, the method uses gene-specific primers and a thermostable reverse transcriptase to convert the hairpin of the miRNA precursors to cDNA. The cDNA is subsequently amplified using real-time PCR with SYBR® green detection.

Amplification reactions such as PCR, RT-PCR and real-time PCR are well known in the art. Briefly, in PCR, two primer sequences are prepared which are complementary to regions on opposite complementary strands of, for example, a target nucleic acid. An excess of deoxynucleoside triphosphates (dNTPs) is added to a reaction mixture along with a DNA polymerase, e.g., Taq polymerase. If the target sequence is present in a sample, the primers will bind to the target and the polymerase will cause the primers to be extended by adding on nucleotides. Each completed primer extension generates a new copy of the targeted nucleic acid. By raising and lowering the temperature of the reaction mixture, the extended primers dissociate from the nucleic acid template to form amplification products, excess primers bind to the targeted nucleic acid and to the amplification products, and the process is repeated.

Although amplification and analysis of the PCR products can be performed sequentially, in “real-time” PCR assays, amplification and analysis occur simultaneously. DNA dyes or fluorescent probes can be added to the PCR mixture before amplification and used to analyze PCR products during amplification. Sample analysis occurs concurrently with amplification in the same tube within the same instrument. This combined approach decreases sample handling, saves time, and greatly reduces the risk of product contamination for subsequent reactions, as there is no need to remove the samples from their closed containers for further analysis. See, for example, U.S. Pat. No. 6,174,670, incorporated herein by reference.

The differences among the various real-time PCR protocols rest in the methods for generating a fluorescence signal with the amplification product. Many different probes are available for monitoring PCR. Although not sequence specific, double stranded DNA (dsDNA) specific dyes can be used in any amplification without the need for probe synthesis. Such dyes include ethidium bromide and SYBR® Green I. With dsDNA dyes, product specificity can be increased by analysis of melting curves or by acquiring fluorescence at a high temperature where nonspecific products have melted. See, for example, Ririe K M, Rasmussen R P and C T Wittwer, Product differentiation by analysis of DNA melting curves during the polymerase chain reaction, Anal. Biochem. 245:154-160 (1997); Morrison T B, J J Weis and C T Wittwer, Quantification of low copy transcripts by continuous SYBR® Green I monitoring during amplification, BioTechniques 24:954-962 (1998).

Oligonucleotide probes can also be covalently labeled with fluorescent molecules. For example, hairpin probes (Molecular Beacons®) and exonuclease probes (TaqMan®) are dual-labeled oligonucleotides that can be monitored during PCR. Another example is the TaqMan® minor groove binder probe. These probes depend on fluorescence quenching of a fluorophore by a quencher on the same oligonucleotide. Fluorescence increases when hybridization or exonuclease hydrolysis occurs.

Molecular beacons have a hairpin structure wherein the quencher dye and reporter dye are in intimate contact with each other at the end of the stem of the hairpin. Upon hybridization with a complementary sequence, the loop of the hairpin structure becomes double stranded and forces the quencher and reporter dye apart, thus generating a fluorescent signal. Tyagi et al. reported use of non-fluorescent quencher dyes, including dabcyl (4-{[4-(dimethylamino)phenyl]diazenyl}benzoyl moiety, absorbance max=453 nm), in combination with fluorescent reporter dyes of widely varying emission wavelength (475-615 nm). See Tyagi S, Bratu D P, Kramer F R, Multicolor molecular beacons for allele discrimination, Nat. Biotechnol. 16:49-53 (1998).

Another format for “real-time” PCR uses DNA probes which are referred to as “5′-nuclease” (or TaqMan®) probes (Lee et al., Nucl. Acid Res. 21:3761-3766 (1993)). These fluorogenic probes are typically prepared with the quencher at the 3′ terminus of a single DNA strand and the fluorophore at the 5′ terminus. During each PCR cycle, the 5′-nuclease activity of Taq DNA polymerase cleaves the DNA strand, thereby separating the fluorophore from the quencher and releasing the fluorescent signal. The 5′-nuclease assay requires that the probe be hybridized to the template strand during the primer extension step (60-65° C.). It is also possible to effect simultaneous “real-time” detection of more than one polynucleotide sequence in the same assay, using more than one fluorophore/quencher pair.

The TaqMan® minor groove binder (MGB) assay utilizes a hydrolysis probe that has a fluorophore on one end of the probe. The fluorophore may be one of the following chemicals: TAMRA, TET, JOE, VIC or NED. The other end of the probe has the TaqMan® minor groove binder/quencher. The hydrolysis probe (TaqMan® MGB) assay takes advantage of the 5′-nuclease ability of DNA polymerase to hydrolyze the fluorophore and minor groove binder from the probe to produce a signal. The hydrolysis probe methods offer an additional degree of specificity. Methods of preparing such probes are described in U.S. Pat. Nos. 5,801,155; 6,790,945; 6,699,975, and 6,653,473, all of which are incorporated herein in their entirety. The use of the TaqMan® minor groove binder probe is especially appealing in the present invention because the presence of the minor groove binder increases the Tm of the probes and allows for the design of shorter probes that are beneficial for the detection of miRNA precursors since there is only a small region in between the sense and antisense primer.

In real-time PCR, reagents generate a fluorescence signal proportional to the number of amplicons produced by the PCR process. Real-time PCR is based upon the principle that the more template initially present, the fewer cycles are necessary to reach the point in the exponential phase where the fluorescence signal rises above the background signal. This point, called the threshold cycle (C_(T)), occurs during the exponential phase and is inversely related to the logarithm of the initial template concentration. Thus a standard curve can be generated with the logarithm of gene copy number as a function of the threshold cycle to permit quantification of unknown samples without any post-amplification sample processing.
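As a non-limiting illustration of the standard-curve approach described above, the following Python sketch fits a straight line of C_(T) against the base-10 logarithm of the starting copy number and then interpolates an unknown sample. The function names and the standard values are hypothetical and are not data from the examples herein.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_curve(copies, cts):
    """Fit C_T as a linear function of log10(starting copy number)."""
    return fit_line([math.log10(c) for c in copies], cts)


def copies_from_ct(ct, slope, intercept):
    """Interpolate the starting copy number of an unknown from its C_T."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series of a standard (assumed values).
    copies = [1e6, 1e5, 1e4, 1e3]
    cts = [15.1, 18.4, 21.8, 25.0]
    slope, intercept = standard_curve(copies, cts)
    print(f"slope={slope:.2f}, intercept={intercept:.2f}")
    print(f"unknown at C_T 20.0: {copies_from_ct(20.0, slope, intercept):.0f} copies")
```

Fitting C_(T) on the logarithm of copy number, rather than the reverse, keeps the measured quantity on the dependent axis; the interpolation step simply inverts the fitted line.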

In “real-time quantitative” PCR, the accumulation of amplification products is measured continuously in both standard dilutions of target RNA and samples containing unknown amounts of target RNA. A standard curve is constructed by correlating initial template concentration in the standard samples with the number of PCR cycles (Ct) necessary to produce a specific threshold concentration of product. In the test samples, target PCR product accumulation is measured after the same C_(T), which allows interpolation of target DNA concentration from the standard curve. Another method, often referred to as “relative quantitative PCR,” determines the relative concentrations of specific nucleic acids.

In one example of the present invention, a real-time quantitative PCR assay is used to monitor the expression of miRNA precursors. The method comprises amplifying the targeted hairpin sequence of the miRNA precursor species through a plurality of amplification cycles in the presence of the fluorescent entity, measuring fluorescence intensity of the fluorescent entity at each of the plurality of amplification cycles to produce a fluorescent value for each cycle related to the quantity of the miRNA precursor species present at each cycle, obtaining a score from each of a plurality of tests, each of the plurality of tests using the fluorescence values to generate the score, and using the scores to ascertain whether the miRNA precursor species is present in the sample and to quantify the miRNA precursor in the test sample. The levels of the miRNA precursor species can be quantified in comparison with an internal standard, for example, levels of a synthetic miRNA precursor of the identical sequence. As described above, the methods for quantitative PCR and variations thereof are well known to those of ordinary skill in the art.

In another example, real-time PCR is used to determine the amount of pre-miRNA precursors only. This method uses a second set of primers, e.g. the reverse primer to the hairpin structure (black primer, FIG. 1) and a new forward primer designed to anneal to a sequence upstream of the hairpin sequence (dashed primer, FIG. 1). In this method, PCR using the hairpin primers (gray/black, FIG. 1) amplifies the pri-miRNA+pre-miRNA, and PCR using the upstream primer along with the reverse hairpin primer (dashed/black, FIG. 1) amplifies only the pri-miRNA. The amount of pre-miRNA is then calculated using the following equation:

$\text{pre-miRNA} = 2^{-C_{T,\,\text{pri-miRNA}+\text{pre-miRNA}}} - 2^{-C_{T,\,\text{pri-miRNA}}}$  (FIG. 5).
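By way of non-limiting illustration, the subtraction in the equation above can be sketched in Python. The sketch assumes that both primer pairs amplify with an efficiency of 2 and that both C_(T) values are read against the same threshold; the function name and the C_(T) values are hypothetical.

```python
def pre_mirna_level(ct_hairpin: float, ct_upstream: float) -> float:
    """Estimate the pre-miRNA level by difference.

    ct_hairpin  : C_T from the hairpin primer pair (pri-miRNA + pre-miRNA)
    ct_upstream : C_T from the upstream/hairpin primer pair (pri-miRNA only)

    Both levels are expressed as 2^(-C_T), which assumes an amplification
    efficiency of 2 for each primer pair (an assumption of this sketch).
    """
    total = 2.0 ** (-ct_hairpin)   # pri-miRNA + pre-miRNA
    pri = 2.0 ** (-ct_upstream)    # pri-miRNA only
    return total - pri


if __name__ == "__main__":
    # Hypothetical C_T values: the hairpin signal crosses threshold two
    # cycles earlier than the pri-miRNA-only signal.
    ct_hairpin, ct_upstream = 24.0, 26.0
    pre = pre_mirna_level(ct_hairpin, ct_upstream)
    fraction = pre / 2.0 ** (-ct_hairpin)
    print(f"pre-miRNA fraction of total precursor signal: {fraction:.2f}")
```

In this hypothetical case a two-cycle difference means the pri-miRNA accounts for one quarter of the hairpin signal, so the remaining three quarters are attributed to the pre-miRNA.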

In another example, TaqMan® minor groove binder probes are used to discriminate nearly identical members of a family of miRNA isoforms (FIGS. 12, 13).

In another example, the assay is adapted and expanded to include primers to some 200 human miRNA precursors (FIG. 15). In this example, miRNA precursor expression is profiled in 32 human cell lines from lung, breast, colorectal, hematologic, prostate, pancreatic and head and neck cancers.

The invention may also comprise one or more kits to perform any of the methods described herein. In one embodiment the kit comprises one or more primer pairs that target the hairpin region of one or more precursor miRNAs. In another embodiment, the kit may further comprise a hairpin specific primer and/or a gene specific primer that targets a region in the primary miRNA that is substantially upstream or downstream of the hairpin sequence. In another embodiment, the kit may comprise, alone or in combination with other reagents, a gene-specific reverse primer to a sequence within the hairpin structure to be used to reverse transcribe the hairpin sequence of miRNA precursor to cDNA. In a non-limiting example, primers, enzymes for reverse transcription, and enzymes for amplification may be included in the kit. The kits may also comprise agents for RNA isolation, purification of amplification products, labels, etc.

The components of the kits may be packaged either in aqueous media or in lyophilized form. The suitable container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and preferably, suitably aliquoted. Where there is more than one component in the kit, the kit also will generally contain a second, third or other additional container into which the additional components may be separately placed. However, various combinations of components may be comprised in a vial. The kits of the present invention may also include a means for containing the reagent containers in close confinement for commercial sale. Such containers may include injection or blow-molded plastic containers into which the desired vials are retained.

While the present teachings have been described in terms of the following examples, those skilled in the art will readily understand that numerous variations and modifications of these examples are possible without undue experimentation. All such variations and modifications are within the scope of the current teachings. Aspects of the present teachings may be further understood in light of the following examples, which should not be construed as limiting the scope of the claims in any way.

Example 1 Quantification of miRNA Precursors in Human Cancer Cell Lines

Materials and Methods

Cell Lines and Tissue Culture.

The following human tumor cell lines were obtained from American Type Culture Collection (Manassas, Va.): K-562 (chronic myelogenous leukemia), HL-60 (promyelocytic leukemia), LNCaP (prostate cancer), HeLa (cervical adenocarcinoma), HCT-8 (colorectal cancer) and HCT-116 (colorectal cancer). S2 Drosophila cells were purchased from Invitrogen (Carlsbad, Calif.). All cancer cell lines were cultured in a humidified atmosphere of 95% air, 5% CO₂ using RPMI 1640 or other suitable media and 10% fetal bovine serum. S2 cells were cultured at room temperature according to Invitrogen's protocol.

RNA, DNA Extraction and Reverse Transcription.

Total RNA was extracted from the cultured cells using TRIZOL (Invitrogen, Carlsbad, Calif.) per the manufacturer's protocol. The concentration of total RNA was quantified by the absorbance at 260 nm. Total RNA was briefly exposed to RNase-free DNase I as previously described by Calin, G. A., et al., Frequent deletions and down-regulation of micro-RNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proc Natl Acad Sci USA, 2002. 99(24): p. 15524-9. RNA was reverse transcribed to cDNA using either random hexamers or gene specific primers and Thermoscript, a thermostable reverse transcriptase (Invitrogen). A 1 μg aliquot of DNase-treated total RNA (10.5 μl total volume) was incubated with 1.5 μl of a cocktail containing 10 μM of each of the antisense primers listed in Table 1. The reaction was heated to 80° C. for 5 min to denature the RNA, then incubated for 5 min at 60° C. to anneal the primers. The reactions were cooled to room temperature and the remaining reagents (5× buffer, dNTPs, DTT, RNase inhibitor, Thermoscript) were added as specified in the Thermoscript protocol and the reaction proceeded for 45 min at 60° C. Finally, the reverse transcriptase was inactivated by a 5 min incubation at 85° C. For the random hexamer primed cDNA, RNA plus 0.25 μl of random primers (Invitrogen) was denatured at 80° C. for 5 min and cooled to room temperature for 10 min to allow the hexamers to anneal. The additional reagents were then added and the reaction proceeded as described above. The minus reverse transcription controls were treated identically as described above except the reactions lacked Thermoscript and primers. Genomic DNA was isolated from HeLa cells as previously described in Sharma, R. C., A. J. Murphy, M. G. DeWald, and R. T. Schimke, A rapid procedure for isolation of RNA-free genomic DNA from mammalian cells. Biotechniques, 1993. 14(2): p. 176-8.

Gene Expression in Low Molecular Weight RNA Fraction.

Total RNA was isolated from HCT-116 cells using Trizol. Seven hundred μg of total RNA was loaded onto the Midi RNA isolation column (Qiagen, Valencia, Calif.). Isolation of low molecular weight (LMW) RNA (approximately 160 nt and less) was achieved following the manufacturer's protocol, including eluting the LMW RNA using buffer QRW2 (750 mM NaCl, 50 mM MOPS, pH 7.0, 15% (v/v) ethanol). Total and LMW RNA were resolved on a denaturing 15% polyacrylamide gel to validate the isolation. One μg of the LMW and total RNA was reverse transcribed to cDNA using Thermoscript and random hexamers or gene specific primers as described above. The cDNA was assayed by real-time PCR using primers for six different miRNA genes and U6 RNA.

Northern Blotting.

Northern blotting was performed as previously reported in Lau, N. C., et al., An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science, 2001. 294(5543): p. 858-62. Briefly, total RNA (30 μg) was resolved on 15% polyacrylamide urea gels and transferred to Genescreen Plus membranes (Perkin Elmer, Boston, Mass.). Oligonucleotides complementary to the mature miRNA were end-labeled with [γ ³²P] ATP and T4 kinase. The membranes were incubated with labeled probe (1.5×10⁶ c.p.m./ml hybridization buffer) prior to visualization using phosphorimaging. Blots were stripped once and re-probed using an oligonucleotide complementary to U6 RNA.

Primer Design, PCR and Validation.

All primers were designed using Primer Express version 2.0 (Applied Biosystems, Foster City, Calif.). The following criteria were used during the primer design. Both sense and antisense primers were designed to be located within the hairpin sequence of the miRNA precursors (FIG. 1). The pre-miRNA sequences are predicted based upon the fold-back structure (described in Griffiths-Jones, S., The microRNA Registry. Nucleic Acids Res, 2004. 32(1): p. D109-11; and Ambros, V., et al., A uniform system for microRNA annotation. Rna, 2003. 9(3): p. 277-279). Mapping the 5′ and 3′ cleavage sites of miR-30a demonstrated that the termini of pre-miR-30a are identical to those of mature 30a and 30a* (see Lee, Y., et al., The nuclear RNase III Drosha initiates microRNA processing. Nature, 2003. 425(6956): p. 415-9). It was presumed here that all pre-miRNAs are processed from the pri-miRNA in this manner. A maximal extension of about 4 nt was allowed for each primer over the presumed 5′ or 3′ termini of the pre-miRNA. Since the hairpin is contained within both the pri-miRNA and the pre-miRNA, primers designed to the hairpin simultaneously amplify both RNAs. We use the term ‘miRNA precursors’ here to be inclusive of both the pri-miRNA and the pre-miRNA. Primers were designed with a maximal T_(m) difference between both primers of ≦2° C. and a primer length between 18-24 nts. An ideal Tm of 55-59° C. was selected for the primers, however due to size constraints, some primers were designed with a T_(m) that was below 55° C. The T_(m) range of all the pre-miRNA primers was 49-59° C., and the median Tm was 56° C. (Table 3). Additional criteria included no 3′ GC clamps, and a minimal amplicon size of about 55 bp.

PCR amplicons were validated using gel electrophoresis (2.2% agarose or 15% polyacrylamide) and by the presence of one peak on the thermal dissociation curve generated by the thermal denaturing protocol that followed each real-time PCR run (see Schmittgen, T. D., et al., Quantitative reverse transcription-polymerase chain reaction to study mRNA decay: comparison of endpoint and real-time methods. Anal Biochem, 2000. 285(2): p. 194-204). The sequences of the miRNA precursor amplicons were determined by subcloning the PCR product generated by amplifying HeLa cell cDNA into TOPO TA cloning vectors (Invitrogen) per the manufacturer's protocol. Plasmid purification and automated DNA sequencing of the plasmids were performed using standard techniques.

TABLE 1 PCR primers used to amplify human miRNA precursors.

Gene | Forward primer (5′→3′) | Reverse primer (5′→3′) | T_(m) of primers, ° C. (Forward/Reverse)
U6 | CTCGCTTCGGCAGCACA | AACGCTTCACGAATTTGCGT | 59/59
let-7d | AACGCTTCACGAATTTGCGT | AAGGCAGCAGGTCGTATAGT | 55/53
miR-15a | GTAGCAGCACATAATGGTTTGTG | GCAGCACAATATGGCCTG | 56/55
miR-16 | GCAGCACGTAAATATTGGCGT | CAGCAGCACAGTTAAATACTGGAG | 59/57
miR-18 | TAAGGTGCATCTAGTGCAGATAG | GAAGGAGCACTTAGGGCAGT | 53/55
miR-20 | GCACTAAAGTGCTTATAGTGCAG | GTACTTTAAGTGCTCATAATGCA | 53/51
miR-21 | GCTTATCAGACTGATGTTGACTG | CAGCCCATCGACTGGTG | 53/55
miR-24-2 | CTCCCGTGCCTACTGAGCT | CCCTGTTCCTGCTGAACTGAG | 57/59
miR-28 | GGAGCTCACAGTCTATTGAGTTACC | CCTCCAGGAGCTCACAATCT | 56/56
miR-29 | ATGACTGATTTCTTTTGGTG | ATAACCGATTTCAGATGGTG | 49/51
miR-30a | GTAAACATCCTCGACTGGAAGCT | GCTGCAAACATCCGACTGAA | 58/58
miR-30d | GTTGTTGTAAACATCCCCGAC | GCAGCAAACATCTGACTGAAAG | 56/56
miR-33 | TGTGGTGCATTGTAGTTGCA | CTGTGATGCACTGTGGAAAC | 56/54
miR-92-1 | TCTACACAGGTTGGGATCGG | CGGGACAAGTGCAATACCATA | 57/57
miR-93-1 | AAGTGCTGTTCGTGCAGG | CTCGGGAAGTGCTAGCTCA | 55/55
miR-101 | GCCCTGGCTCAGTTATCACA | GCCATCCTTCAGTTATCACAGTA | 57/55
miR-105-1 | CAAATGCTCAGACTCCTGTGGT | GCACATGCTCAAACATCCGT | 58/58
miR-107 | CAGCTTCTTTACAGTGTTGCCT | GATAGCCCTGTACAATGCTGC | 56/56
miR-124a- | TCCGTGTTCACAGCGGAC | CATTCACCGCGTGCCTTA | 58/58
miR-147 | CTAAAGACAACATTTCTGCACAC | ATCTAGCAGAAGCATTTCCAC | 53/53
miR-216 | TGGCTTAATCTCAGCTGGCA | TGAGGGCTAGGAAATTGCTCT | 58/58
miR-219 | TCCTGATTGTCCAAACGCAA | GGGACGTCCAGACTCAACTCTC | 59/59
miR-220 | CCACACCGTATCTGACACTTT | CAGACCGCATCATGAACAC | 54/54
miR-224 | GGCTTTCAAGTCACTAGTGGTTC | CTTTGTAGTCACTAGGGCACCA | 56/56

Real-Time Quantitative PCR.

Real-time quantitative PCR was performed using standard protocols on an Applied Biosystems 7900HT Sequence Detection System. Briefly, 5 μl of a 1/100 dilution of cDNA in water was added to 12.5 μl of the 2× SYBR® green PCR master mix (Applied Biosystems), 800 nM of each primer and water to 25 μl. The reactions were amplified for 15 sec at 95° C. and 1 min at 60° C. for 40 cycles. The thermal denaturation protocol was run at the end of the PCR to determine the number of products that were present in the reaction. All reactions were run in triplicate and included no template and no reverse transcription controls for each gene. The cycle number at which the reaction crossed an arbitrarily-placed threshold (C_(T)) was determined for each gene and the relative amount of each miRNA to U6 RNA was described using the equation 2^(−ΔC_(T)), where ΔC_(T)=C_(T,miRNA)−C_(T,U6 RNA) (see Livak, K. J. and T. D. Schmittgen, Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods, 2001. 25(4): p. 402-8). Relative gene expression was multiplied by 10⁶ in order to simplify the presentation of the data.
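The relative expression calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated equation; the function name and the example C_(T) values are hypothetical and are not taken from the experiments.

```python
def relative_expression(ct_mirna, ct_u6, scale=1e6):
    """Relative amount of a miRNA precursor versus U6 RNA.

    Implements 2^(-dCT) with dCT = CT(miRNA) - CT(U6 RNA), then multiplies
    by `scale` (10^6 in the text) to simplify presentation.
    """
    delta_ct = ct_mirna - ct_u6
    return (2.0 ** -delta_ct) * scale


# Hypothetical example: the precursor crosses threshold 10 cycles after U6.
example = relative_expression(ct_mirna=30.0, ct_u6=20.0)
print(example)
```

A precursor with the same C_(T) as U6 RNA would give a scaled value of 10⁶ under this scheme.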

Calculation of PCR Efficiency.

PCR efficiency was determined as previously described in Mygind, T., et al., Determination of PCR efficiency in chelex-100 purified clinical samples and comparison of real-time quantitative PCR and conventional PCR for detection of Chlamydia pneumoniae. BMC Microbiol, 2002. 2(1): p. 17, from the equation N=N₀×E^(n), where N is the number of amplified molecules, N₀ is the initial number of molecules, n is the number of PCR cycles and E is the efficiency, which is ideally 2. When the equation is rearranged to the form n=−(1/log E)×log N₀+(log N/log E), a plot of log copy number versus C_(T) yields a straight line with a slope=−(1/log E). To experimentally determine PCR efficiency, HeLa cell genomic DNA was serially diluted 10-fold over 4 logs. The diluted genomic DNA was amplified by real-time PCR using the identical conditions established for the gene expression analysis. Plots were made of the log of the template concentration versus the C_(T) and the PCR efficiency was calculated from the slope of the line using the equation described above. Actual concentration of template is not needed when determining the efficiency as it depends only upon the slope of the line.
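As an illustration of the slope-based calculation, the following Python sketch fits a least-squares line to C_(T) versus log10 of the relative template amount and converts the slope to an efficiency using E=10^(−1/slope), which follows from slope=−(1/log E). The function name and the dilution data are hypothetical; an ordinary least-squares fit is assumed here because the text does not state how the line was fitted.

```python
def pcr_efficiency(log10_template, ct_values):
    """Estimate PCR efficiency E from a standard curve.

    Fits CT = slope * log10(template) + intercept by least squares and
    returns E = 10 ** (-1 / slope), since slope = -(1 / log10 E).
    """
    n = len(log10_template)
    mean_x = sum(log10_template) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_template, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_template)
    slope = sxy / sxx
    return 10.0 ** (-1.0 / slope)


# Hypothetical 10-fold dilution series over 4 logs (relative units only;
# absolute template concentration is not needed).
dilutions = [0.0, -1.0, -2.0, -3.0, -4.0]
cts = [20.0, 23.4, 26.7, 30.1, 33.4]
print(round(pcr_efficiency(dilutions, cts), 2))
```

A slope near −3.32 cycles per 10-fold dilution corresponds to the ideal efficiency of 2.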

Treeview Analysis of PCR Data.

The expression of each miRNA relative to U6 RNA was converted to pseudocolors and plotted using the Treeview cluster analysis as previously reported (Dittmer, D. P., Transcription Profile of Kaposi's Sarcoma-associated Herpesvirus in Primary Kaposi's Sarcoma Lesions as Determined by Real-Time PCR Arrays. Cancer Res, 2003. 63(9): p. 2010-5; Fakhari, F. D. and D. P. Dittmer, Charting latency transcripts in Kaposi's sarcoma-associated herpesvirus by whole-genome real-time quantitative PCR. J Virol, 2002. 76(12): p. 6213-23; Eisen, M. B., et al., Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA, 1998. 95(25): p. 14863-8). Expression that had a value equal to 1 was designated black, expression that was greater than 1 was designated red and expression that was less than 1 was designated green. Genes with undetectable expression were designated gray.
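The color designations above amount to a simple lookup, sketched here in Python. This is not the Treeview software itself, which also grades color intensity; the function name is hypothetical, and representing undetectable expression as None is an assumption made for the example.

```python
def pseudocolor(relative_expression):
    """Map a relative expression value to the color class described above.

    None stands for undetectable expression (an assumption of this sketch).
    """
    if relative_expression is None:
        return "gray"
    if relative_expression > 1:
        return "red"
    if relative_expression < 1:
        return "green"
    return "black"


# Hypothetical values for four genes.
print([pseudocolor(v) for v in (4.2, 1, 0.03, None)])
```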

Results

Validation of PCR Primers.

To amplify the miRNA precursors, PCR primers were designed to anneal to the hairpin (FIG. 1). Amplification of short hairpins by the PCR could present a challenge because of the competition between annealing of the primer and reformation of the hairpin. Primers were designed to 23 different pre-miRNA genes using the criteria described in Materials and Methods. Shown in FIG. 2 are the products from amplifying HeLa cell genomic DNA using six of the miRNA precursor primers. All six reactions produced amplicons of the expected size with no additional products. All of the primer pairs listed in Table 1 met the criteria of one peak on the thermal dissociation curve and a single band of the correct size on either agarose or polyacrylamide gels. As a further validation, the amplicon from pre-miR-147 was subcloned and sequenced. Comparison of the sequence data verified that 100% of the new sequence was amplified. These results demonstrate our ability to successfully amplify short hairpins using the PCR.

Validation of Reverse Transcription Conditions.

Our initial attempts to reverse transcribe total RNA used random hexamer priming. Real-time PCR of the resulting cDNA using the primers listed in Table 1 produced varied PCR signals in the cell lines tested, i.e., miRNA precursors were expressed at high, intermediate and low levels (data not shown). It later occurred to us that it may be very difficult to prime the pre-miRNA with random hexamers. This is because pre-miRNAs are very short (<80 nt) and the stoichiometry of primer annealing should be much less than that of a primer binding to a larger RNA such as mRNA. Furthermore, a competition exists between the annealing of the random primers to the pre-miRNA and hairpin formation, which is compounded by the low temperatures (25° C.) at which random primers are typically annealed. We hypothesized that the PCR signal generated from amplifying cDNA primed with random hexamers was due to amplifying the much longer pri-miRNA and not the pre-miRNA.

To test this hypothesis, a LMW RNA fraction was isolated from total RNA. The LMW RNA fraction contains RNA <160 nt and should separate the pre-miRNA (˜75 nt) from the larger pri-miRNA. Denaturing polyacrylamide gel electrophoresis verified that RNA <160 nt was recovered in the LMW fraction (not shown). Both the LMW and total RNA were primed with random hexamers or gene-specific primers and reverse transcribed using the Thermoscript reverse transcriptase.

In order to determine the effectiveness of priming the reverse transcriptions, real-time PCR was performed on the cDNA using primers for two miRNAs (let-7d and miR-15a) as well as U6 RNA. Total RNA primed with random hexamers produced less cDNA compared to total RNA primed with gene-specific primers (FIG. 3). More LMW RNA was converted to cDNA when primed with gene-specific primers compared to random hexamers (FIG. 3). Even for the 106 nt U6 RNA that does not contain any hairpins, higher yields of cDNA were achieved using the gene-specific priming compared with random hexamers (FIG. 3B). We conclude that reverse transcription proceeds through secondary structure such as hairpins if priming occurs at some point upstream of the hairpin. However, to prime short RNA molecules, in particular small RNAs containing hairpins, gene-specific primers and not random primers should be used.

Intra-Assay Variation.

To evaluate the intra-assay variation of the real-time PCR assay, flasks of HeLa, HCT-116 and HL-60 cells were cultured in triplicate. Total RNA was isolated from the cultures. A 1 μg aliquot of the total RNA was converted to cDNA as described in Materials and Methods. The relative expression of the precursors for miR-18, -107 and -29 was determined using the real-time PCR assay. The mean, standard deviation and coefficient of variation from the triplicate RNA isolations/reverse transcriptions are shown in FIG. 10. The coefficient of variation among the different genes and cell lines was quite low, ranging from 1.8 to 34.5%.
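For reference, the three summary statistics reported for each triplicate can be computed as in the Python sketch below. The function name and input values are hypothetical, and the use of the sample standard deviation (n−1 denominator) is an assumption, since the text does not specify which form was used.

```python
import statistics


def summarize_triplicate(values):
    """Return (mean, standard deviation, coefficient of variation in %).

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption of this sketch.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    cv_percent = 100.0 * sd / mean
    return mean, sd, cv_percent


# Hypothetical relative expression values from three independent cultures.
mean, sd, cv = summarize_triplicate([9.0, 10.0, 11.0])
print(mean, sd, cv)
```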

Real-Time PCR of miRNA Precursors.

An important issue for quantitative PCR is that the efficiency of amplification for each gene in the study (including the internal control) should be very similar and be close to the ideal value of 2. Although amplicon lengths were very similar and all the miRNA genes contained the hairpin, large differences in the T_(m) existed among the primers (Table 1). PCR efficiency was determined on the U6 RNA as well as six miRNA genes, two with a low T_(m) (49-53° C.), two with an intermediate T_(m) (55-56° C.) and two with a higher T_(m) (58-59° C.). The efficiency of all seven genes was very similar and was close to the ideal value of 2 (FIG. 11). There was no trend of altered efficiency with T_(m) in these genes.

The cDNA of K-562, HL-60, LNCaP, HeLa, HCT-8, HCT-116 and Drosophila S2 cells was amplified by the PCR using primers for 23 miRNA precursors. Moderate to strong PCR signals were generated when cDNA was used as a template for most of the miRNA precursor primers. PCR amplicons were not generated from the S2 cDNA, or in the no template or no reverse transcription controls. In the cases where expression of the miRNA genes was very low (C_(T)≧35), the primers were validated on HeLa cell genomic DNA. This was done in order to determine whether the weak signal generated by amplifying cDNA was due to the primers not working or to the lack of template in the cDNA (i.e. the gene was not expressed).

The reproducibility of real-time PCR tends to become worse when very low copies of template are amplified. For this reason, the following criteria were used to calculate the mean relative gene expression and to distinguish between low and undetectable expression. If three out of three PCRs were above the threshold after 40 cycles and the thermal dissociation profiles of all three reactions matched, the C_(T) of all three plots was used in the relative expression calculation. If two out of three PCRs were above the threshold after 40 cycles and the thermal dissociation plots of the two reactions matched, then the PCR that was below threshold was discarded and the mean of the remaining two was used in the relative expression calculation. If only one or no PCRs out of three were above threshold after 40 cycles, then the expression of the gene was classified as ‘undetectable’.
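The replicate-handling criteria above can be written as a small decision function. The Python sketch below is illustrative only: the function name is hypothetical, a reaction that did not cross the threshold within 40 cycles is represented as None, and the judgment of whether dissociation profiles match is assumed to be supplied by the analyst as a boolean. The text does not say what was done when two or three reactions crossed the threshold but their profiles did not match, so the sketch raises an error in that case rather than guessing.

```python
def usable_cts(cts, profiles_match):
    """Apply the triplicate criteria to three CT values.

    cts: list of three CT values; None means the reaction did not cross
         the threshold after 40 cycles.
    profiles_match: True if the thermal dissociation profiles of the
         above-threshold reactions match (assessed by the analyst).

    Returns the CT values to average, or None for 'undetectable'.
    """
    above = [ct for ct in cts if ct is not None]
    if len(above) <= 1:
        return None  # one or no PCRs above threshold: undetectable
    if not profiles_match:
        # Not covered by the stated criteria; flag for manual review.
        raise ValueError("dissociation profiles do not match")
    return above  # all three, or the remaining two


print(usable_cts([36.0, None, 36.4], True))
```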

Representative real-time amplification plots of the pre-miRNA are shown in FIG. 4. Shown are the PCR plots for miR-21 and let-7d in HCT-8 cDNA (FIG. 4A). Strong signals were generated for both genes when cDNA template was amplified but not on the no template or no reverse transcription controls. The thermal dissociation curves generated at the end of the real-time PCR run demonstrate that the miR-21 and let-7d primers amplified a single product that was different from the products generated on the negative controls (FIG. 4B). FIG. 4 demonstrates how the dissociation curves may be used to distinguish true PCR amplicons from the noise that is often generated by amplifying no template controls or low copies of template.

miRNA Precursor Expression in Human Cancer Cell Lines.

The relative expression of 23 miRNA precursors was determined in six human cancer cell lines. Expression data on four miRNA precursor genes are shown in FIG. 6. The expression of miR-93-1 in the HCT-116 colorectal cancer cell line was 50-fold higher than in the HCT-8 colorectal cancer cell line (FIG. 6A). miR-24-2 was expressed at relatively constant levels in all of the cell lines except in HeLa cells, which expressed 5- to 10-fold higher levels (FIG. 6C). The colorectal cancer cell lines and HeLa cells expressed higher levels of miR-29 and miR-147 compared to the blood cancers and prostate cancer cell lines (FIGS. 6B and D). The expression of the 23 miRNA precursors varied within a particular cell type such as HCT-116 (FIG. 7). The difference in expression of the miRNA precursors varied over 4,000-fold within this cell line. miR-21 had the highest level of expression and miR-30a, the lowest level of expression. The expression of four miRNAs (miR-20, -28, -33 and -216) was undetectable in HCT-116 cells.

The relative expression of the miRNA precursors was presented using the Treeview algorithm (FIG. 8). This allowed visualization of large amounts of data in a single figure. The relative gene expression values were multiplied by 10⁶. Median expression was set equal to the value of one and was indicated by the color black (FIG. 8). Increased (red), decreased (green) and no expression (gray) were plotted relative to the median value. Although some exceptions existed, miRNA precursor expression across the cell lines was more or less similar (i.e. expression was either high, intermediate, low or undetectable in each of the six cell lines). This type of analysis allows for the easy identification of individual genes with very different expression within the group. For example, HCT-116 and HeLa cells expressed much higher levels of miR-21 than the other cell lines and miR-224 was undetectable only in HL-60 cells.

Validation of Real-Time PCR Results with Northern Blotting.

Northern blotting is currently the established method to monitor miRNA expression. In order to validate our real-time PCR data, Northern blotting was performed on the total RNA from three different cell lines using probes for miR-29, -21 and -224. miRNAs were selected that demonstrated high expression by our PCR assay and that had diverse expression among the cell lines. The trend in expression between the mature miRNAs as detected by Northern blotting and the miRNA precursors as detected by PCR was identical (FIG. 9). The pre-miRNA was visible by Northern blotting only for miR-21 (HeLa and HCT-116) and miR-224 (HeLa). While strong amplification was generated on miR-29 (HCT-116), no pre-miRNA band was visible by Northern blotting. We were unable to detect any miRNA (mature or precursor) using a probe for miR-220. While Northern blots were attempted using probes for only four miRNAs, the lower limit of detection by Northern blotting was a relative expression value of 0.25×10⁶ by the PCR assay. This suggests that most of the miRNAs labeled as green in FIG. 8 would be undetectable by Northern blotting in our hands and substantiates the enhanced sensitivity of the PCR compared to Northern blotting.

DISCUSSION

As an alternative to Northern blotting, we developed a real-time PCR assay to quantify the expression of the miRNA precursors. The short hairpins of 23 of 29 genes attempted were successfully converted to cDNA and amplified using standard real-time PCR methods. For this reason we believe that the assay may be expanded to include most of the human miRNA precursors and could eventually include all of the predicted human miRNA genes once discovered. This assay should easily be adaptable to other organisms such as plants, C. elegans and Drosophila. The Applied Biosystems 7900HT sequence detection system used here was equipped with a 96-well block. This instrument is adaptable to a 384-well block that would increase the assay throughput 4-fold. Therefore, the assay described here could rival the throughput of microarrays and could be advantageous compared to microarrays due to the increased sensitivity of the PCR. Sensitive PCR assays coupled with methods to capture individual cells, such as laser-assisted microdissection, can be used to study the regulation and expression of miRNAs in individual cell types.

Presentation of quantitative PCR data using red/green pseudocolors is a relatively recent phenomenon. The only investigator to our knowledge to organize real-time PCR data in such a manner is Dittmer, during the development of a genome-wide assay for all of the open reading frames of the Kaposi's sarcoma herpesvirus. It is practical to generate and present gene expression data in this manner only if the number of genes of interest is relatively small (<500). However, if the number of genes is relatively small (such as miRNAs), then presentation of real-time PCR data in this manner accomplishes the same result as microarrays, including (i) high throughput analysis of gene expression and (ii) presentation of large amounts of data as pseudocolors to visualize differences in expression levels.

Pre-miRNA is processed to the ˜22 nt mature miRNA by Dicer-like enzymes in all species in which miRNAs have been identified. The method described here provides quantitative data on the miRNA precursors only and not on the mature miRNA. Using a transcriptional fusion of the let-7 promoter to gfp, it was shown that let-7 is temporally regulated by transcription and not by processing of the pre-miRNA or stability of the mature miRNA. In CLL patients and cancer cell lines, 23 of 60 samples showed the ˜70 nt miR-15a precursor that was not found in any normal tissues except bone marrow. The expression of Dicer was relatively constant in these patients, suggesting inefficient processing of the miRNAs in some CLL patients that was not related to Dicer expression. The precursors of 26 miRNAs were equally expressed in non-cancerous and cancerous colorectal tissue from patients. However, the expression of mature miR-143 and -145 (but not the other 24 miRNAs) was greatly reduced in cancerous tissue compared to non-cancerous tissue, again suggesting altered processing for specific miRNAs in human disease.

We demonstrate here that the expression of three miRNA precursors measured by the PCR assay (miR-21, -29 and -224) paralleled the expression of the mature miRNAs from Northern blots. In order to fully characterize the expression of large numbers of miRNAs, it may be necessary to quantify both the mature miRNAs and the miRNA precursors using sensitive assays such as the PCR. A major challenge in measuring the mature miRNA using RT-PCR is the small size of the mature miRNA (˜22 nt). There may be situations (such as in normal development) in which processing or stability of the miRNA is not regulated and the expression of the miRNA precursors reflects the levels of the active, mature miRNA. There may exist other circumstances (such as in human disease) where alterations in miRNA biogenesis produce levels of mature miRNA that are very different from those of the pre-miRNA. In the former situation, sensitive PCR assays such as the one described here could be used to measure the miRNA precursor as a means to predict the levels of mature miRNA, while in the latter situation, sensitive assays will be necessary to measure the mature miRNA.

Example 2 Quantification of Pre-miRNA

The materials and methods are as described in Example 1. Briefly, PCR using the hairpin primers (FIG. 1) amplifies both the pri-miRNA and pre-miRNA. To amplify only the pri-miRNA, the antisense primer to the hairpin is used along with a new sense primer that is designed to anneal to a sequence upstream of the hairpin sequence of the pri-miRNA (FIG. 1). PCR using the hairpin primers (gray/black, FIG. 1) amplifies the pri-miRNA+pre-miRNA and PCR using the upstream primer along with the antisense hairpin primer (dashed/black, FIG. 1) amplifies only the pri-miRNA. The amount of pre-miRNA is then calculated using the equation:

pre-miRNA=2^(−C_(T,pri-miRNA+pre-miRNA))−2^(−C_(T,pri-miRNA)).
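The subtraction above can be illustrated with a short Python sketch. It follows the equation as written, using the C_(T) from the hairpin primers (pri-miRNA plus pre-miRNA) and the C_(T) from the upstream primer pair (pri-miRNA only). The function name and the C_(T) values are hypothetical.

```python
def pre_mirna_amount(ct_total, ct_pri):
    """Amount of pre-miRNA by difference.

    ct_total: CT from the hairpin primers (pri-miRNA + pre-miRNA).
    ct_pri:   CT from the upstream sense primer with the antisense
              hairpin primer (pri-miRNA only).
    """
    total = 2.0 ** -ct_total
    pri = 2.0 ** -ct_pri
    return total - pri


# Hypothetical case: the hairpin primers cross threshold one cycle
# earlier than the pri-miRNA primers, i.e. a 2-fold difference in
# template, so the pre-miRNA amount equals the pri-miRNA amount.
pre = pre_mirna_amount(ct_total=24.0, ct_pri=25.0)
print(pre == 2.0 ** -25.0)
```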

Quantification of Pri-miRNA and Pre-miRNA.

All of the miRNA primers were designed to anneal to the hairpin of the miRNA precursors (FIG. 1). A sense primer (5′ GGGCTTTAAAGTGCAGGG 3′) was designed to the pri-miR-18 (dashed, FIG. 1). This primer along with the antisense primer for miR-18 was used to amplify the pri-miR-18. Real-time PCR was performed on the cDNA from the six cancer cell lines using primers for the miR-18 precursors and the pri-miR-18.

The C_(T) generated from the miR-18 precursors was slightly lower than the C_(T) for the pri-miR-18 (FIG. 5A). Differences of one C_(T) unit in real-time PCR data are typical when detecting a 2-fold difference in template. The amount of pre-miRNA was calculated as described in Materials and Methods for Example 1. The relative amounts of pre-miR-18, pri-miR-18 and total precursors (pri-miR-18+pre-miR-18) were determined in each of the six cancer cell lines (FIG. 5B). While more of the miR-18 precursors were expressed in K-562 cells, the relative amounts of pri-miR-18 and pre-miR-18 were approximately equal in all six cell lines. This demonstrates that each pri-miRNA is processed to one pre-miRNA molecule and shows that there is no regulation of Drosha processing for miR-18 in these cell lines.

Example 3 Use Of TaqMan® MGB Probes to Distinguish miRNA Isoforms

Many of the discovered human miRNA genes are grouped in families of two or more nearly identical isoforms. The largest of the human families include let-7 (14 members) and miR-30 (6 members). miRNA isoform families may be one of two types. The first type is when the mature miRNAs have nearly identical sequences, usually differing by 1-3 nt. These families are designated with a letter (e.g. let-7b and let-7c). The second type is for miRNA genes that produce the identical mature miRNA from a slightly different precursor gene (e.g. let-7a-1 and let-7a-2). These are designated with a number, implying that both genes, let-7a-1 and let-7a-2, produce the identical mature miRNA (let-7a). The isoforms are usually located on different chromosomes. There is more sequence diversity in the precursor gene compared to the mature miRNA. For example, while miR-30c-1 and -30c-2 produce the identical mature miRNA, their precursor genes are only 79% identical. The greatest degree of sequence variation in the precursor miRNAs lies in the loop portion of the hairpin.

It is desirable to be able to detect and quantify the expression of only one specific isoform in samples that contain many members of a family of isoforms, e.g. the let-7 family members. To this end, an aspect of the current invention involves designing TaqMan® MGB probes to the loop portion of the miRNA precursor so as to allow discrimination of individual members of a family of miRNA isoforms. The following example illustrates such a method.

Part A of FIG. 12 shows the sequences of 11 members of the human let-7 microRNA family. The red and blue sequences depict the sequences of the forward and reverse primers, respectively, for each gene. The sequences colored in yellow differ slightly (primer binding sites only are shown). The sequences in black that lie in between the red and blue primers are the potential sequences to which the TaqMan® probes may be designed to bind. The purpose of the TaqMan® probe is to allow the PCR product to be detected by the real-time PCR instrument's fluorescent detector. Two things must happen in order for the TaqMan® probes to fluoresce: PCR must occur and the PCR enzyme must cleave the probe. In order for the TaqMan® probes to fluoresce, the probe must bind to 100% of the DNA sequence that lies in between the two primers.

TaqMan® probes are typically designed with a Tm that is 10° C. higher than that of the primers. As shown in FIG. 12, the space in between the primer annealing sites is very short (˜15-20 bp). The presence of the MGB allows the design of short TaqMan® probes with Tms ranging from 61-68° C.

A TaqMan® MGB probe was designed to anneal to the loop portion of the miRNA precursor. The probe targeted the human let-7d sequence 6FAM-ATT TTG CCC ACA AGG A-MGBNFQ (double underlined sequence, FIG. 12B). This probe binds to the reverse complementary sequence in the human let-7d gene that lies in between the red and blue primer sequences. When PCR is performed on DNA that contains the human let-7d sequence using the gene-specific primers for let-7d and the TaqMan® probe, the probe fluoresces and the product is detectable.
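To make the binding relationship concrete, the strand to which the let-7d probe anneals can be derived by taking the reverse complement of the probe sequence given above. The Python sketch below is an illustration only; the function name is hypothetical, and only the probe sequence is taken from the text.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# let-7d TaqMan MGB probe sequence from the text, written 5' to 3'.
probe = "ATTTTGCCCACAAGGA"
print(reverse_complement(probe))  # sequence of the strand the probe anneals to
```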

To demonstrate the specificity of detection for the let-7d TaqMan® MGB probe, the sequences of six miRNA isoforms (let-7a-1, let-7a-2, let-7a-3, let-7f-1, let-7f-2 and let-7d) were cloned into plasmids using PCR and TOPO TA cloning. The identity of the sequences was verified by DNA sequencing. Real-time PCR was performed using gene-specific primers to each of the six let-7 isoform plasmids as well as a no template control. The TaqMan® MGB probe for let-7d was included in each of the PCRs. The results show that only the PCR with the let-7d plasmid was detected (FIG. 13A). To demonstrate that amplification of template occurred in each reaction, a sample of each reaction was run on an agarose gel (FIG. 13B). Therefore, while amplification occurred in each of the six reactions, the TaqMan® MGB probe detected only the let-7d amplicon. This demonstrates the specificity of the detection of TaqMan® MGB probes when similar sequences are amplified.

Example 4 Expansion of miRNA Precursor PCR Assay

The total number of miRNA precursor assays was expanded to include 201 of the known human miRNAs as of the date of this application. To demonstrate the usefulness of the assay, the expression of 201 miRNAs was screened on samples of RNA from various human tissues (colon, pancreas, ovary, lung and brain). The primers used are presented in FIG. 15 and the results of the analysis are presented in FIG. 14.

Materials and Methods

Cell Lines, Tissues and Tissue Culture.

The following human tumor cell lines were used: K-562 (chronic myelogenous leukemia), HL-60 (promyelocytic leukemia), Daudi and Ramos (Burkitt lymphoma), Jurkat (T-cell leukemia); LNCaP, PC3, PPC-1, DU145 and TSU-PR1 (prostate); SCC17A, SCC17B, SCCD12, SCC10B and SCC5 (head & neck squamous cell carcinoma); MDA231, T47D, SKBR3, MDA361 and MCF7 (breast cancer); SW620, HCT8, HCT116, HT29 and HCT15 (colorectal carcinoma); Panc1 and Hs 766T (pancreatic); H23, H522, HOP62, A549 and H719 (lung cancer); RH30, RH3, CW9019, SMS-CTR and RD2 (rhabdomyosarcoma) and SK-Hep1, PLC/PRF5, SNU387, SNU449 and H719 (liver cancer). Cells were obtained from American Type Culture Collection (Manassas, Va., USA) or from various laboratories. Cancer cell lines were cultured in a humidified atmosphere of 95% air, 5% CO₂ using RPMI 1640 or other suitable media and 10% fetal bovine serum. Total RNA from normal human liver and skeletal muscle tissue was purchased from Ambion (Austin, Tex.). Hepatocellular carcinoma tumors were received from Dr. Lewis Roberts, Mayo Clinic, Rochester, Minn.

Primers and TaqMan® MGB Probes.

Primers were designed to all of the known human miRNAs as of December, 2004. These 222 miRNA genes include 38 families of isoforms. Many of the miRNA isoforms differed by only 1-3 bp in the primer binding sequence (FIG. 12). If the difference in sequence occurred towards the 5′ end of the primer, then the same pair of primers was used to amplify both isoforms. If the sequence difference occurred towards the 3′ end of the primer or there were multiple differences, then a unique pair of primers was designed to each isoform. Although we refer to the expression of 222 miRNA precursors, in actuality, primers were designed to and data are presented on 201 miRNA precursors since several isoforms were amplified by the same pair of primers.

Primers were designed to the primary precursor molecule for several miRNAs. These are designated by the letter “P” in FIG. 15. Primers were designed to the primary precursor if we were unable to successfully design primers to the hairpin-containing precursor. Unsuccessful primer design was defined as either an inability to amplify genomic DNA or detection of multiple products using SYBR® green. The latter problem could be alleviated using TaqMan® probes. In addition, some primers were designed to the primary precursors of miRNA isoforms. Primers were designed using Primer Express version 2.0 (Applied Biosystems, Foster City, Calif.) using the criteria previously described in EXAMPLE 1. TaqMan® MGB probes were designed using Primer Express software. Probes were designed to have a 5′ FAM and a MGB at the 3′ end. TaqMan® MGB probes were synthesized by Applied Biosystems. Sequences of the TaqMan® MGB probes are listed in Table 2. Primers were validated on human genomic DNA (Roche), mouse genomic DNA, cDNA synthesized from Universal Human Reference RNA (Stratagene) and no template control reactions.

TABLE 2
TaqMan® MGB probes to members of let-7 family of isoforms

Gene        Sequence (5′→3′)                       Tm (° C.)
let-7a-1    5′-FAM-CACCCACCACTGG-MGB 3′            61
let-7a-3    5′-FAM-CTCTGCCCTGCTATG-MGB 3′          67
let-7b      5′-FAM-AGTGATGTTGCCCC-MGB 3′           65
let-7c      5′-FAM-AGTTACACCCTGGGA-MGB 3′          62
let-7d      5′-FAM-ATTTTGCCCACAAGGA-MGB 3′         67
let-7e      5′-FAM-ACACCCAAGGAGATC-MGB 3′          67
let-7f-1    5′-FAM-TTACCCTGTTCAGGAG-MGB 3′         63
let-7f-2    5′-FAM-TACCCCATCTTGGAG-MGB 3′          63
let-7g      5′-FAM-TACCACCCGGTACAGGA-MGB 3′        68
let-7i      5′-FAM-ATTGCCCGCTGTGGA-MGB 3′          67

RNA Extraction, DNA Extraction and Reverse Transcription.

cDNA was synthesized from total RNA using gene-specific primers as described in EXAMPLE 1. The gene-specific primers included a mixture of each of the antisense primers to all of the miRNAs and U6 RNA listed in FIG. 15. Following an 80° C. denaturation step and 60° C. annealing, the cDNA was reacted for 45 min at 60° C. as described in EXAMPLE 1. Genomic DNA from NIH 3T3 mouse fibroblasts was isolated as described in Sharma, R. C., et al., A rapid procedure for isolation of RNA-free genomic DNA from mammalian cells. Biotechniques 1993. 14(2):176-8.

Real-Time PCR.

The expression of the miRNA precursors was determined using real-time quantitative PCR as described in EXAMPLE 1 with several modifications. Three μl of a master mix containing all of the reaction components except the primers was dispensed into a 384-well real-time PCR reaction plate (Applied Biosystems) using a 12-channel repeating pipette (Model EDP3-Plus, Rainin Instruments, Woburn, Mass., USA). The master mix contained 0.5 μl of 10× PCR buffer, 0.7 μl of 25 mM MgCl₂, 0.1 μl of 12.5 mM dNTPs, 0.01 μl UNG, 0.025 μl AmpliTaq Gold DNA polymerase, 0.5 μl of dilute cDNA (1:50) and water to 3 μl. All of the PCR reagents were from the SYBR® green core reagent kit (Applied Biosystems). A 2 μM solution of each pair of primers listed in FIG. 15 was stored in 12-well PCR strip tubes. Two μl of each primer was dispensed into duplicate wells of the 384-well plate using the 12-channel repeating pipette. Everything was identical for the TaqMan® assays except that the TaqMan® core reagent kit (Applied Biosystems) and 200 nM of the TaqMan® MGB probes were used. Each miRNA listed in FIG. 15 and U6 RNA was assayed in duplicate in the 384-well reaction plate. Real-time PCR was performed on an Applied Biosystems 7900HT real-time PCR instrument equipped with a 384-well reaction block. PCR was performed for 15 seconds at 95° C. and one minute at 60° C. for 40 cycles followed by the thermal denaturation protocol. TaqMan® and SYBR® green assays may be run simultaneously on the 7900HT real-time instrument. The expression of each miRNA relative to U6 RNA was determined using the 2^(−ΔC_(T)) method. To simplify the presentation of the data, the relative expression values were multiplied by 10⁵.
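The per-well volumes above imply the water volume per well and the final primer concentration, which the following Python sketch works out. It is an arithmetic illustration only. The 10% pipetting overage is an assumption added for the example, as is the reading that each primer is present at 2 μM in the stock; the function and variable names are hypothetical.

```python
# Per-well master mix volumes in microliters, as stated in the text.
PER_WELL_UL = {
    "10x PCR buffer": 0.5,
    "25 mM MgCl2": 0.7,
    "12.5 mM dNTPs": 0.1,
    "UNG": 0.01,
    "AmpliTaq Gold": 0.025,
    "dilute cDNA (1:50)": 0.5,
}
MASTER_MIX_UL = 3.0    # master mix dispensed per well
PRIMER_UL = 2.0        # primer solution dispensed per well
PRIMER_STOCK_UM = 2.0  # assumed concentration of each primer in the stock


def water_per_well_ul():
    """Water needed to bring the listed components to 3 microliters."""
    return MASTER_MIX_UL - sum(PER_WELL_UL.values())


def scaled_master_mix_ul(n_wells, overage=0.10):
    """Component volumes for n_wells, with an assumed pipetting overage."""
    factor = n_wells * (1.0 + overage)
    mix = {name: vol * factor for name, vol in PER_WELL_UL.items()}
    mix["water"] = water_per_well_ul() * factor
    return mix


def final_primer_conc_um():
    """Primer concentration in the final 5 microliter reaction."""
    return PRIMER_STOCK_UM * PRIMER_UL / (MASTER_MIX_UL + PRIMER_UL)


print(round(water_per_well_ul(), 3), round(final_primer_conc_um(), 3))
```

Under these assumptions the final primer concentration comes to 0.8 μM, which agrees with the 800 nM of each primer used in EXAMPLE 1.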

Validation of miRNA Precursor Primers, SYBR® Green.

Each pair of primers listed in FIG. 15 was validated on human genomic DNA, cDNA synthesized from Universal Human Reference RNA, mouse genomic DNA and no template control reactions. All of the primers listed in FIG. 15 worked successfully on human genomic DNA (not shown). Successful amplification was defined by the presence of a single dissociation peak on the thermal melting curve. For those reactions that produced multiple dissociation peaks, a new pair of primers was designed to the primary precursor miRNA. These primers are listed with the designation “p”, e.g. miR-9-1(p) (FIG. 15). Many of the miRNA genes that required priming of the primary precursor were miRNA genes with known isoforms (e.g. miR-9-1, -19b-1, -106a). About 70% of the human primers successfully amplified mouse genomic DNA (FIG. 15). The ability of the primers to amplify both human and mouse miRNA genes is likely due to the similarity in sequence among these genes. Human miRNA primers were not tested on mouse cDNA.

miRNA Precursor Expression Profiling in Cancer Cell Lines.

The expression of 222 miRNA precursors was profiled in 32 commonly used cell lines of lung, breast, head & neck, colorectal, prostate, pancreatic and hematopoietic cancers. Gene expression data was normalized to U6 RNA. U6 was validated as an internal control by comparing its expression levels in each of the cell lines. U6 RNA was consistently expressed in each of the 32 cell lines, thus U6 RNA is an acceptable internal control for quantitative PCR in these cell lines.

The relative expression was determined for each of the 222 miRNA precursors. The relative expression for all 222 miRNA precursors was clustered using unsupervised hierarchical clustering and presented as a heatmap (FIG. 14). Unsupervised hierarchical clustering was performed on the data expressed as ΔC_(T). The heatmap and dendrogram demonstrate that most of the cell lines clustered according to the tissue from which each cell line was ostensibly derived (FIG. 14). Five of five hematopoietic and head & neck cell lines and two of two pancreatic cell lines produced unique clusters. Four of five lung and colorectal cell lines produced unique clusters as well. The breast cancer and prostate cancer cell lines tended to cluster together, with 4 of 5 of the prostate cell lines forming a cluster (along with one breast cancer cell line) and 3 of 5 breast plus one prostate forming another cluster (FIG. 14B).

Full details of the presented EXAMPLE 4 are provided in Jiang, J, Lee, E J, Gusev Y and Schmittgen T, Real-time expression profiling of microRNA precursors in human cancer cell lines, Nucleic Acids Research, 33(17):5394-5403 (2005), the contents of which are incorporated herein by reference.

1. A method for identifying the expression of both pri-microRNA and pre-microRNA precursors in a sample, said method comprising the initial step of using a gene-specific reverse primer to reverse transcribe a target nucleotide sequence, wherein the target nucleotide sequence comprises a substantial portion of a hairpin sequence shared by both the pri- and the pre-microRNA. 2-34. (canceled) 